Social Search and the Integrity of the Social Graph

The “Elemental Web” was first a connection of machines, then a connection of sites. Now it is a complex amalgam, awash with traditional links and millions of ‘inter-personal’ connections defined by the Social Graph. But what exactly is the Social Graph, is it open to manipulation, and how might this affect experimentation in Social Search? How shall we seek to vanquish the Social Chimera?

Defining the Social Graph

First let me define the Social Graph and the fundamentals of Social Search. The former is the connection of people and the relationships that define those connections. It is built on newer technologies such as the Social Graph API as well as “older” technologies such as XFN and FOAF. It puts the ‘human face on linking’. Its emergence is driven by the uptake of Social Platforms (blogs, microblogs, Social Networks), the platforms on which the Social Graph is expressed and lives. Every participant has their own Social Graph (to whom and how they are connected), and the superset is a connectivity mesh of immense complexity (around the world in Six Degrees). Social Search seeks to leverage information within the Social Graph to improve the relevance of results. Put simply, we are influenced by our connections, and therefore recommendations from trusted ‘friends and digital acquaintances’ have an obvious relevance and appeal. Trust, relevance and measurable contribution are key areas on which to focus. For an excellent introduction to the topic I recommend “Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives” by Dr. Nicholas Christakis and Dr. James Fowler, as well as “Trust Agents” by Chris Brogan and Julien Smith.

A range of start-ups (including some notable failures) have been active in Social Search experimentation (Sproose, Mahalo, Jumper 2.0, Wikia Search, Qitera, Scour, Wink, Eurekster, Baynote, Delver, OneRiot and SideStripe). Of ‘more’ interest, however, are the ongoing trials in Google Labs…

Travelling back through the annals of time we recall a “Web of Machines”: an interconnected backbone and fabric, the foundations of the modern Internet. Search Engines arose and linking strategies developed; soon links between sites became tradable commodities. Reciprocal linking remains a popular SEO (search engine optimisation) strategy (you link to my domain, I link to yours). What began as standard Webmaster to Webmaster behaviour was soon under significant ‘manipulation’ (aka gaming) by link spammers and link farms. This targeted link popularity at a domain level, but the ‘take home’ is that once search algorithms are (at least partially) understood, and there is benefit in higher placement in search engine results, ‘algorithmic gaming’ is ever present.

The Social Web has given rise to a new form of linking: inter-personal links within the Social Graph. “You follow me, I follow you” is a personification of ‘old school’ reciprocal linking. The ‘back link’ is no longer to the domain; it is the ‘personal’ connection in the Social Graph. It could be argued that this is just good social graces: I’m interested in you, and therefore you should reciprocate the same interest. As with link farms and link spamming in the “pre-Social Web”, we are of course seeing a volume of similar misbehaviour affecting the Social Graph across today’s Social Platforms.
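
To make the parallel with reciprocal linking concrete, here is a minimal Python sketch of how ‘follow-back’ behaviour might be measured. The follow lists and the function name follow_back_ratio are my own illustrative assumptions; they are not drawn from any platform’s API.

```python
def follow_back_ratio(follows, person):
    """Share of the accounts `person` follows that also follow `person` back.

    `follows` is a hypothetical mapping of account name -> set of accounts
    that account follows (illustrative data, not a real platform API).
    """
    following = follows.get(person, set())
    if not following:
        return 0.0
    mutual = sum(1 for other in following if person in follows.get(other, set()))
    return mutual / len(following)


# Illustrative data only.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": set(),
}
print(follow_back_ratio(follows, "alice"))  # 0.5: bob follows back, carol does not
```

A ratio close to 1.0 across a very large following is not proof of gaming (it may simply be good social graces), but it is the kind of signal an algorithm could weigh alongside others.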

Social Search Algorithms

My concern, and what I want to highlight in this piece, is the potential to skew emerging Social Search algorithms, and how they must account for ‘hyper-connected gaming’. Naturally what motivates this (mis)behaviour is ‘short cutting’: rather than build up a following organically through ‘service to the connected community’, you simply ‘snowball’ a following using automated techniques such as ‘mass following’. Twitter is the ultimate sandpit. It could certainly be argued that, if not fuelled by snake oil, it is well lubricated with it. This is not in itself a criticism of Twitter’s model, but rather recognition that auto-pilot users (often “mavens” or “work at home” marketing specialists) are ‘swamping’ the platform with all manner of affiliate schemes, which they promote through mass communication and mass following techniques. This is not what I classify as pro-social behaviour.

The great joy of the Social Web of course is that people and behaviours can be ignored and dismissed. Un-friend, un-follow, block are all readily available choices. Surprisingly however, research shows that we rarely do much housekeeping on our online networks and hence the Social Graph is additive rather than truly reflective.

With that précis of how I view some online social (or really anti-social) behaviour, I return to my concern about how the Social Graph is open to manipulation. I’ve written on numerous occasions about proactive Social Networking and how I feel this is often beneficial.

Connections extend possibilities, but there is a value to those connections and indeed to how we behave in the social context of those connections, be that by social graces (etiquette) or through positive contribution to group and community dynamics. I view proactive or speculative networking on sites like Twitter as very useful. Indeed my metaphor for such is a “tap on the shoulder”, Twitter being particularly valuable in this regard due to its non-invasive nature.

Go Social on Google Labs

I am currently participating in the Social Search experiment on Google Labs, and it is through this that we must seek to vanquish the Social Chimera’s influence. The principle of the experiment is “more easily find relevant blogs, reviews and other public content from your social circle”. The social circle is determined by the Social Graph, which for the purposes of this experiment means the links and connections found within a Google Profile. In my case this points to all of my Social Site presences such as LinkedIn, Twitter, Facebook and YouTube. You begin to see the potential: the more I am connected within the ‘graph of others’, the more likely it is that my recommendations and interests show up in the Social Search results of others (establishment of motive and opportunity). Manipulation of this centrality might therefore yield increased influence or (heaven forbid) opportunity to drive monetisation through questionable affiliate schemes. This presents problems: new motivations to drive hyper-connectivity (now inter-personal), and a need to filter the Social Graph and the Social Search results to clear them of the behaviours associated with such manipulation.

Gaming the Algorithm

The good news is that Google is astute and experienced in recognising, accounting for and penalising algorithmic gaming. But all is not simple. There are some very pro-social characters heavily involved in Social Media who have 100,000+ followers on Twitter. Many also follow the same number. Does this denote egalitarianism and utility, or something less admirable? The link alone certainly does not provide transparency as to the utility and nature of the relationship with each of those 100,000 connections. We need to look therefore at relationship reinforcement in the Social Graph. If two people are tagged in a photograph they are “close by definition”; multiple conversations and multiple connections across disparate social sites also reinforce connections. But this still lacks sophistication as it negates (or dilutes) the role of the influencer. Such relationships might be more characteristically ‘one-way’, but nonetheless I might be more interested in the Social Search result of an influencer than in that of a weak connection. It is also difficult to ascertain current “levels of manipulation” and how people within those networks should be accounted for (or discounted). Twitter seems endemically littered with ‘friend collectors’, fuelled by an insatiable (and mistaken) hunger for collection of worthless and highly contrived influence. So this presents the dilemma of how to filter the signal from the noise. This epitomises what I believe to be a principal challenge of Social Search.
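
As a rough illustration of relationship reinforcement, the following Python sketch scores a connection from the kinds of evidence mentioned above (shared photo tags, conversations, connections across several sites). The signal names, weights and caps are entirely my own assumptions, chosen only to show the shape of the idea. Note that it does nothing to address the one-way influencer problem just described.

```python
from dataclasses import dataclass

# Hypothetical weights and caps, chosen only for illustration.
WEIGHTS = {"photo_tags": 0.4, "conversations": 0.4, "shared_platforms": 0.2}
CAPS = {"photo_tags": 3, "conversations": 10, "shared_platforms": 4}


@dataclass
class TieEvidence:
    """Reinforcing signals observed between two people (assumed inputs)."""
    photo_tags: int = 0        # photographs in which both are tagged
    conversations: int = 0     # exchanges between the two
    shared_platforms: int = 0  # distinct social sites on which they are connected


def tie_strength(evidence):
    """Return a score in [0, 1]; a bare follow link with no evidence scores 0."""
    score = 0.0
    for signal, weight in WEIGHTS.items():
        count = max(0, getattr(evidence, signal))
        # Cap each signal so sheer volume cannot dominate the score.
        score += weight * (min(count, CAPS[signal]) / CAPS[signal])
    return score


bare_link = TieEvidence()
close_tie = TieEvidence(photo_tags=2, conversations=10, shared_platforms=3)
print(tie_strength(bare_link))
print(tie_strength(close_tie))
```

Capping each signal is a deliberate choice in this sketch: a ‘friend collector’ cannot inflate a relationship simply by generating volume, whereas several independent kinds of evidence together push the score up.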

What is noise, what is the signal, and how is this algorithmically quantified across a vast array of differing Social Graphs? How do we qualify and ‘level’ the meaning and importance of relationships?

Beyond some of the basic reinforcement checks I described earlier, I suggest that semantic analysis, sentiment analysis and measures of utility in relationships and contributions are of primary appeal for research and development. Personal control is also important. It could be argued that this is intrinsic in controlling personal Social Graphs at their source, but this involves restraint and is very difficult to visualise (it is also not a great deal of fun and constrains the potential of proactive networking). Control over the search results and how these feed through the Social Graph to the results of others (i.e. privacy control) also needs to be thought through. This could lead to the creation of additional Inference Channels, which we may prefer remained ‘hidden’.

I encourage you to engage in the Social Search experiment, at the same time ruminating on your perception of social participation on the Web. The motivations of others naturally need to be questioned, but drive your networking on the basis of pro-social activity. Share, contribute, grow, but be cognisant of the Social Graph, its emerging centrality to Social Search and a need to preserve its integrity. Equally, follow with interest further debate concerning algorithmic tuning to ensure social results are not “manipulated” by hyper-connectors. Ponder also Google’s strategy in Social Networking.

Google’s Social Strategy

Create profiles, connect with others, collaborate and share: Google Profiles, Gmail, Google Reader, Google Groups, Google Sidewiki, Google Talk, YouTube, Picasa, Google Wave, Social Search and so forth. There is no direct landing page or dedicated Portal (with the exception of Orkut), but ‘all of the above’ sounds a lot like a decentralised, feature-rich Social Networking platform! Could it be argued that Google’s lack of an explicit “overarching site” is leading to more natural social interaction and a purer emergent Social Graph from those actions?

I might typically end with a rhetorical “will Google be the ‘glue that binds’?”, but certainly it already is. It is the ultimate in “decentralised” Social Networking, Social Search being a tantalising addition if utility and purity can be appropriately delivered. PageRank established a mechanism for quantifying the importance and authority of sites. Will “PageRank for People” emerge as a ‘solution’ to Social Search manipulation?
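
For the curious, here is a minimal sketch of what a “PageRank for People” calculation could look like if the standard PageRank iteration were simply applied to a follow graph. This is my own illustration, not a description of anything Google has announced; the function name people_rank, the damping value of 0.85 and the sample data are all assumptions.

```python
def people_rank(follows, damping=0.85, iterations=50):
    """Power-iteration PageRank over a follow graph.

    `follows` is a hypothetical mapping of person -> list of people they
    follow. Following someone passes a share of your own rank to them.
    """
    people = set(follows) | {p for targets in follows.values() for p in targets}
    n = len(people)
    rank = {p: 1.0 / n for p in people}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in people}
        for person in people:
            targets = follows.get(person, ())
            if targets:
                share = damping * rank[person] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Someone who follows nobody spreads their rank evenly.
                share = damping * rank[person] / n
                for other in people:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Illustrative data only: three people follow "cat", who follows only "ann".
follows = {"ann": ["cat"], "bob": ["cat"], "cat": ["ann"], "dan": ["cat"]}
for name, score in sorted(people_rank(follows).items(), key=lambda kv: -kv[1]):
    print(name, round(score, 3))
```

On its own, a calculation like this would be just as open to reciprocal ‘follow rings’ as the original was to link farms, which is why the reinforcement and utility measures discussed earlier would still matter.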

Article first published on the Atos CIO Blog, Feb 2010
