
The Web as a Series of Custom Databases of Knowledge & Transactions

Some web users unknowingly spread programs across large social networks, and those programs can easily carry background routines that launch a DDoS attack on third-party sites:

But as the popularity of third-party applications has grown, computer-security researchers have also begun worrying about ways that social-networking applications could be misused. The same thing that makes social networking such an effective way to distribute applications--deep access to a user's networks of friends and acquaintances--could perhaps make it an ideal way to distribute malicious code.

eBay has limited the ability of sellers to give feedback on buyers and increased transaction costs for sellers, which makes their database even less valuable when compared against the rest of the web.

"Seller discontent with eBay is on the rise due to higher fees and other changes, and we believe eBay has seen numerous sellers migrating away from the eBay platform and creating their own selling sites," the Wedge analysts wrote.

General web search, online communities, and the web as a whole share 3 big problems that hinder growth:

  • anonymity - it is hard to trust content if you don't know who creates it. And even if you trust the end publisher, it is virtually impossible to trust everything on their sites if they have large sections of user generated content mixed in with their own editorial content.
    • The WSJ is trying to expand its scope and influence by turning itself into a social network.
    • Most Amazon.com reviews are good, but one day I saw some horrific looking political forums on Amazon.com! At the same time I have predicted future search related events with great accuracy while being incorrect when writing about other topics. On an individual level most of us are both expert and ignorant.
    • Affiliate programs and anonymous domain registration make it easy for a person or company to place separation between their company and their marketing.
  • staleness - answers that were once correct, but no longer are...this factor is compounded by the self-reinforcing nature of search engine rankings, where top ranked sites get more exposure and build more signals of quality.
  • spam - new information that is so aggressively commercial that Google would rather show stale information than risk ranking it...they want the organic search results to have an informational bias so they can sell ads against them.

These issues are being fought by different people in different ways. Joel Spolsky created Stack Overflow, a community of programmers who vote on the quality of answers. After a five-week closed beta the service is already superior to most competing sites:

Already, it’s better than other Q&A sites, because you don’t have to read through a lot of discussion to find the right answer, if it’s in there somewhere.

Indeed, you can’t even have a discussion. A lot of people come to Stack Overflow, not knowing what to expect, and try to conduct a discussion when they should be answering the question. The trouble here is that answers are always listed in order of votes, not chronologically, so the discussion instantly becomes scrambled when the votes start coming in.

Instead, we have editing. Once you’ve earned a little bit of reputation in the system (and there are all kinds of ways to earn reputation), you can edit questions and answers.

Being new and easy to edit prevents staleness. And being voted upon by a community of peers limits the reach and benefit that one can get from giving misinformation and/or staying anonymous, while those who add a lot of value to the site will likely want to take credit for their work.
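The vote ordering described in the quote above can be sketched in a few lines of Python. This is only an illustration, not Stack Overflow's actual code: the `Answer` fields, the sample thread, and the tie-break on posting order are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    author: str        # hypothetical user name
    text: str
    votes: int         # net votes from the community
    posted_order: int  # 1 = first answer posted


def display_order(answers):
    """Sort by votes, highest first; ties fall back to posting order (an assumption)."""
    return sorted(answers, key=lambda a: (-a.votes, a.posted_order))


# A made-up thread in which bob replies to ann instead of answering the question.
thread = [
    Answer("ann", "Use a dictionary lookup.", 2, 1),
    Answer("bob", "Replying to ann: that fails on duplicates.", 0, 2),
    Answer("cat", "Use a set, then a dictionary lookup.", 7, 3),
]

ordered = display_order(thread)
for answer in ordered:
    print(answer.votes, answer.author, answer.text)
```

Once the votes come in, cat's answer jumps to the top and bob's reply sinks to the bottom, detached from the post it was responding to. That is why a back-and-forth discussion gets scrambled, and why editing the best answer works better than replying to it.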

Google is trying to solve some of these issues through connecting people to content via OpenSocial and OAuth. But their position as a gatekeeper fueled by advertising limits how they look at the web. This truth was revealed in their own research before they became the web's advertising engine.

Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users. For example, in our prototype search engine one of the top results for cellular phone is "The Effect of Cellular Phone Use Upon Driver Attention", a study which explains in great detail the distractions and risk associated with conversing on a cell phone while driving. This search result came up first because of its high importance as judged by the PageRank algorithm, an approximation of citation importance on the web [Page, 98]. It is clear that a search engine which was taking money for showing cellular phone ads would have difficulty justifying the page that our system returned to its paying advertisers. For this type of reason and historical experience with other media [Bagdikian 83], we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers.

Read through Google's recent leaked spam rating documents and you will see that they are more concerned with commerce than truth.

At the core of the PageRank algorithm Google counts links as votes. That which is most remarkable wins, but things are often remarkable because they are extreme and/or incorrect. Google offsets this by showing a diverse set of results. But extreme information either reinforces your worldview or is so far from it that it is easy to ignore, and neither case adds balance to one's perception of truth.
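To make "links as votes" concrete, here is a minimal sketch of the textbook PageRank iteration on a toy link graph. It is not Google's production ranking: the page names, the link graph, the 0.85 damping factor, and the handling of pages with no outbound links are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively score pages; each page splits its score among the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A link is a vote: pass an equal share to each linked page.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outbound links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: three blogs cite a study, only one links to a shop.
toy_links = {
    "blog_a": ["study"],
    "blog_b": ["study"],
    "blog_c": ["study", "shop"],
    "study": [],
    "shop": [],
}

scores = pagerank(toy_links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

The page that attracts the most links ends up with the highest score, whether it was linked to because it is accurate or because it is extreme. The algorithm measures attention, not truth.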

The issue of information accuracy is one that Tim Berners-Lee, the inventor of the World Wide Web, is looking to solve with the World Wide Web Foundation. In an interview with the BBC, Tim was quoted as saying:

"On the web the thinking of cults can spread very rapidly and suddenly a cult which was 12 people who had some deep personal issues suddenly find a formula which is very believable," he said. "A sort of conspiracy theory of sorts and which you can imagine spreading to thousands of people and being deeply damaging."