Improving influencer marketing with data science

The holy grail of social media marketing is earned media enabled by the reach of Influencers. Some influencers are well known — celebrities and others who qualify for Facebook Mentions — and they are the top 0.1% of the 1/9/90 rule. Marketers typically spend time planning how to create engaging content for the 1% so that they in turn can influence the 9% of “amplifiers” who can then reach the 90% of “lurkers”.

However, the identification of Influencers has generally been a blunt instrument, and until recently social media monitoring platforms could not provide the insights needed to build a competitive advantage.

Until now, the Influencers reported by social media monitoring tools were typically lists ranked by various algorithms and dominated by the same high-scoring sites and people across a very broad range of topics. Finding useful candidates meant combing through the lists manually and seeking out those who stood out: people with a distinct voice and personality who might be approachable.

Applying data science to Twitter data shows that influencer targeting can be made much more programmatic, and more differentiating for planning. Sure, Twitter’s not the be-all and end-all, but it did account for 99.5% of social media mentions during the most recent Tour de France, and it lends itself to data analytics. What data science reveals is that around every topic there are clusters, people grouped into “communities”, who are more like each other than they are like the other people and communities talking about that topic.

This is nice stuff.

Birds of a Spammer flock together

The first nice thing data science reveals is how closely the spammer networks congregate. That makes sense: every new spam account rapidly cross-follows the other spam accounts to build up numbers before Twitter shuts it down. Spammers are also becoming more inventive in their profiles and tweet streams, which might delay Twitter but does not fool the data science.
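To make the cross-following idea concrete, here is a minimal Python sketch. It measures how many pairs of accounts in a group follow each other in both directions. The function name, the account names and the toy data are all made up for illustration; this is not how any particular monitoring tool does it.

```python
from itertools import combinations


def mutual_follow_density(follows, group):
    """Share of account pairs in `group` that follow each other both ways."""
    pairs = list(combinations(sorted(group), 2))
    if not pairs:
        return 0.0
    mutual = sum(
        1 for a, b in pairs
        if b in follows.get(a, set()) and a in follows.get(b, set())
    )
    return mutual / len(pairs)


# Hypothetical toy data: each account maps to the accounts it follows.
follows = {
    "spam1": {"spam2", "spam3"},
    "spam2": {"spam1", "spam3"},
    "spam3": {"spam1", "spam2"},
    "fan1": {"fan2", "team"},
    "fan2": {"team"},
    "team": set(),
}

spam_density = mutual_follow_density(follows, {"spam1", "spam2", "spam3"})
fan_density = mutual_follow_density(follows, {"fan1", "fan2", "team"})
```

In this toy example every spam pair follows each other, while none of the fan pairs do. Real data will be far messier, so treat the number as one signal among several rather than a verdict.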

Improving Influencer Marketing with Data Science Image 1


Take a look at this image of how closely a spam network clusters together, in this case right at the centre of the topic. I know it is spam because of the next level of detail, but the image is magic at this less detailed level — a picture certainly tells a thousand words in this case.

Improving Influencer Marketing with Data Science Image 2

The image above shows the clusters or communities which have been identified by data science (in this case from Sysomos) as having more in common with each other than with other members of the search set.
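I don't know the exact method Sysomos uses, so the Python sketch below swaps in a simple stand-in technique to show the general idea: keep a connection only when the two accounts share most of their neighbours (Jaccard similarity), then treat whatever stays connected as a community. The function name, the `min_overlap` threshold and the toy accounts are my own assumptions.

```python
from collections import defaultdict


def find_communities(edges, min_overlap=0.5):
    """Group accounts whose neighbourhoods overlap strongly.

    Keeps an edge only if the Jaccard similarity of the two accounts'
    neighbourhoods (each including the account itself) is at least
    `min_overlap`, then returns the connected components that remain.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].update((a, b))
        neighbours[b].update((a, b))

    # Keep only the "strong" edges between similar accounts.
    strong = defaultdict(set)
    for a, b in edges:
        shared = neighbours[a] & neighbours[b]
        combined = neighbours[a] | neighbours[b]
        if len(shared) / len(combined) >= min_overlap:
            strong[a].add(b)
            strong[b].add(a)

    # Collect connected components over the strong edges.
    communities, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(strong[node] - component)
        seen |= component
        communities.append(component)
    return communities


# Hypothetical toy data: two tight groups joined by one link (c-d).
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "d"),
    ("d", "e"), ("d", "f"), ("e", "f"),
]
communities = find_communities(edges)
```

Here the single link between `c` and `d` is dropped because those two accounts share few neighbours, which leaves two communities of three accounts each.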

One of the skills in this type of analysis is optimising the search terms, because the data science works only on the results those search terms return.

The two large central communities in the diagram are both spam networks, and some of the other communities are “independent” spam communities too, e.g. one from Indonesia. At the centre of the core spam networks is @mycheapjobss, an account to be avoided at all costs. The good thing about knowing that data science can identify spam networks is that this can be applied directly to topic searches and analysis.

Avoiding Spammer-associated Influencers

We can apply the above knowledge and data science results to improve our influencer marketing by avoiding those who otherwise may have been seen as prime candidates. The image below is from the ICC Cricket World Cup in March this year.

Improving Influencer Marketing with Data Science Image 3

You can see a large cluster (a community) on the lower left, which is linked to the other communities on the right-hand side by a relatively small number of connections, especially four key people. The community on the lower left is a spam community which has latched on to the activity surrounding the World Cup.

Now here is the interesting part. Those people that are providing the bridge between the non-spam communities on the right and the spam community on the left would “traditionally” have been rated very highly as prospective influencers. These people rate high in influence about cricket, and are extremely likely to have turned up on an “influencer marketing” list. And, in fact, if you examine their profiles they seem like just the right kind of people: passionate, active, real, and high authority.

However, the data science alerts us to be cautious. When the data science is applied to further levels of detail it becomes apparent that most of these “bridges” between the legit World Cup communities and the spammers have been “dragged” towards the spammer community — as the image shows — precisely because they have bought followers from these spammers. The maths is unbiased, it has done the clustering, and these Influencers are connected to the spammers as much as they are connected to the legit cricket communities.
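One simple way to put a number on how far a “bridge” has been dragged is to ask what share of its connections sit in the spam community. The Python sketch below assumes you already have a list of spam accounts from the clustering step. The function names, the 0.4 threshold and the account names are hypothetical, chosen purely for illustration.

```python
def spam_share(connections, spam_accounts):
    """Fraction of an account's connections that sit in the spam community."""
    connections = set(connections)
    if not connections:
        return 0.0
    return len(connections & set(spam_accounts)) / len(connections)


def flag_tainted(candidates, spam_accounts, threshold=0.4):
    """Names of candidates whose spam share is at or above `threshold`."""
    return sorted(
        name for name, connections in candidates.items()
        if spam_share(connections, spam_accounts) >= threshold
    )


# Hypothetical toy data, not real accounts.
spam_accounts = {"spam1", "spam2", "spam3"}
candidates = {
    "bridge_account": {"spam1", "spam2", "fan1", "fan2"},  # 2 of 4 are spam
    "clean_account": {"fan1", "fan2", "fan3", "spam3"},    # 1 of 4 is spam
}
tainted = flag_tainted(candidates, spam_accounts)
```

With this toy data only `bridge_account` is flagged, because half of its connections are spammers. Where to set the threshold is a judgment call that depends on the topic and the data.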

Therefore, the conclusion is that either by directly buying followers, or by being sold a pup by an agency or others who have bought followers, these people are severely tainted and to be avoided as potential Influencers.

You might ask: what’s the problem? They still provide a conduit to the active legit cricket communities, so why not use them?

Here are a few reasons to avoid them:

  • Firstly, as soon as you send anything from your own legit accounts, the spammers may pick you up and annoy you more than the 10 fake LinkedIn profile connection requests you are getting each day. Pain in the butt, right?
  • Secondly, if the chosen Influencer retweets your “message” then it will certainly come to the attention of the spam network. This means they will use your Twitter Lists to spam not only you but all your Twitter friends and acquaintances. If the message contains Twitter links to your customer then your customer will also be attacked.
  • Thirdly, the reach and engagement statistics you report about how successful your outreach has been will be vastly overstated, as half or more of that reach would simply be into the spammer network. Remember, the chosen Influencer is already “infected”: if you examine their account details you’ll see all the spammers in the Follower/Following lists, and mentions by these people are worse than worthless.
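The third point is easy to check with a bit of arithmetic. The Python sketch below strips known spam accounts out of an Influencer's audience and reports how much of the headline reach was spam. The function name and all of the numbers are invented for illustration, and it assumes the spam accounts have already been identified.

```python
def adjusted_reach(audience, spam_accounts):
    """Return (reported reach, reach excluding spam, overstated share)."""
    audience = set(audience)
    reported = len(audience)
    genuine = len(audience - set(spam_accounts))
    overstated = (reported - genuine) / reported if reported else 0.0
    return reported, genuine, overstated


# Hypothetical toy data: 10 followers, 6 of them known spam accounts.
spam_accounts = {f"spam{i}" for i in range(6)}
audience = spam_accounts | {"fan1", "fan2", "fan3", "fan4"}

reported, genuine, overstated = adjusted_reach(audience, spam_accounts)
```

In this made-up case the reported reach of 10 shrinks to a genuine reach of 4, so 60% of the headline number was going into the spam network.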


The rise of data science applied to influencer marketing allows influencers infected by spammer networks to be identified and avoided, and thus helps identify clean legit influencers on the topic of interest.

