What are the key roles in data science? Here's a look at the various job positions in the data science industry and what they mean.
A data scientist cleans, massages, and organizes Big Data, according to DataCamp in the following infographic.
A database administrator, on the other hand, focuses on backup and recovery, data modeling and design, distributed computing, database systems, and data security.
Another role is statistician. This person "collects, analyzes, and interprets qualitative as well as quantitative data with statistical theories and methods," according to DataCamp.
To find out more about the different roles in data science, click or tap on the infographic.
COMMENTARY: Data-driven business processes are no longer a nice-to-have but a need-to-have capability. So, if you’re an executive, manager, or team leader, one of your toughest assignments is managing and organizing your analytics and reporting initiative.
The days of business as usual are over. Data generation costs are falling every day, and so is the cost of collection and storage. At the same time, businesses need to move from insight to action ever faster. Systems of Record, Systems of Engagement, and Systems of Insight are being transformed by consumerization and digitization.
With this tsunami of data and new applications, the bottleneck is clearly shifting from transaction processing to Analytics & Insight-driven “sense-and-respond” Action. This slide from IBM’s Investor Briefing summarizes the data-driven transformation underway in most businesses.
For the first time, Klout has published a paper on how it calculates and measures a person’s influence, in an attempt to make its service more transparent.
Klout, owned by Lithium Technologies, is a website and mobile app that uses social media analytics to rank its users according to online social influence via the “Klout Score”, a numerical value between 1 and 100. For many, it’s a measure of how active and influential their social profile is across several social networks such as LinkedIn, Facebook, Twitter, and Tumblr, and popular blog hosts such as WordPress.
The Klout scoring system assigns scores to 750 million users using machine learning that examines:
9 different social networks on a daily basis
Over 3,600 features that capture signals of influential interactions, aggregated across multiple dimensions for each user
Over 45 billion interactions from social networks every day
Adithya Rao, lead research engineer behind the Klout Score, said:
“Though the Klout Score has been around for years, this is the first time we have published a paper that provides this much detailed information on what’s behind it. Getting this paper accepted is a step towards a more transparent view into the Klout score, and by publishing this information in a public forum others will be able to build upon the work we’ve done.”
Although there are multiple ways to consider what constitutes ‘influence’, Klout defines it as a user’s ability to drive action from their posts or social interactions. Many ‘influencer lists’, for example, pull together people active on social media platforms based on the number of followers, the ratio between followers and the users they follow, or simply how many times they appear in a hashtag search. Klout digs a lot deeper, using machine learning and data science to listen to the signals behind the noise and to distinguish between people who merely claim to be influencers and those who are more insightful, the real influencers.
Who responds to your messages and interactions also feeds into the algorithms behind Klout. For example, if Elon Musk responds to a message you posted, then he has been influenced by your message and has acted on it. This is weighted far differently than a response from someone who holds little influence across multiple domains, since Klout also validates that users with higher Klout Scores are able to spread information wider in a network.
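One way to picture this weighting, as a minimal sketch only: each response contributes in proportion to the responder's own score. The linear scaling and the function name weighted_response_total are illustrative assumptions, not Klout's published method.

```python
def weighted_response_total(responder_scores):
    """Sum responses, each scaled by the responder's own score (1-100).

    Assumption: a linear weight of score / 100. The article only says that
    responses from more influential users are weighted differently.
    """
    return sum(score / 100 for score in responder_scores)


# One reply from a highly influential account versus three from low-scoring ones.
high = weighted_response_total([95])
low = weighted_response_total([10, 10, 10])
```

Under this assumed scheme, the single high-scoring reply outweighs the three low-scoring ones combined.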
When a user registers on Klout, they associate their identities on different social networks with their Klout profile. Data is then gathered per source:
For Twitter, public data is collected via the Mention Stream.
Anonymised data for opted-in Lithium Communities comes from in-house datastores.
Data for other social networks is collected via REST APIs on the user’s behalf, based on the granted permissions.
Data is collected continuously from interactions in a trailing window of 90 days using the Play Framework. The collected data is written out to a distributed file system, where batched parsing and processing is done using Hadoop MapReduce and Hive. From there the user’s Klout score is calculated, with each reaction weighted according to further factors:
The difference between the current time and the time at which the reaction occurred.
The social network on which the reaction was performed.
The unit of original content or action on which the reaction was performed.
The type of reaction.
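The four factors above can be pictured as a per-reaction weight. The following is only a sketch under stated assumptions: the 90-day window comes from the article, but the linear recency decay, the multiplicative combination, every number in the weight tables, and the names Reaction and reaction_weight are hypothetical and are not taken from Klout's paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW_DAYS = 90  # trailing window stated in the article

# Hypothetical weight tables, chosen only for illustration.
NETWORK_WEIGHT = {"twitter": 1.0, "facebook": 0.8, "linkedin": 0.6}
CONTENT_WEIGHT = {"post": 1.0, "comment": 0.5}
REACTION_WEIGHT = {"reply": 1.0, "share": 0.9, "like": 0.3}


@dataclass
class Reaction:
    when: datetime        # time at which the reaction occurred
    network: str          # social network on which it was performed
    content_unit: str     # unit of original content that was reacted to
    reaction_type: str    # type of reaction


def reaction_weight(r: Reaction, now: datetime) -> float:
    """Combine the four factors into one weight (assumed multiplicative)."""
    age_days = (now - r.when) / timedelta(days=1)
    if age_days < 0 or age_days > WINDOW_DAYS:
        return 0.0  # outside the trailing window
    recency = 1.0 - age_days / WINDOW_DAYS  # assumed linear decay
    return (
        recency
        * NETWORK_WEIGHT[r.network]
        * CONTENT_WEIGHT[r.content_unit]
        * REACTION_WEIGHT[r.reaction_type]
    )
```

With these made-up tables, a reply to a Twitter post made today gets the full weight, the same reply made 45 days ago gets half of it, and anything older than 90 days contributes nothing.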
Of course, there are still some elements missing from the scoring mechanism itself, which Klout is working on integrating. When I spoke with Adithya, he said that in the future Klout would look to attribute content from reputable sources like Forbes and weight it accordingly alongside the other networks. Forbes has a Klout score of 99, for example, so being associated with Forbes content would credit a contributing writer like myself with additional influence.
Sentiment, whether negative or positive, also isn’t tracked currently. Adithya acknowledged that it is a measure of influence, but one that is very nebulous to calculate, since people react differently to certain content. And it all hangs on the availability of the information to power the framework itself through the APIs provided by the social networks. Real-time and near real-time measurement is possible with some, but in the case of LinkedIn, for example, Klout’s abilities are limited to certain interactions and data, such as the user’s job title and status updates. A user’s LinkedIn blog posts, for example, aren’t taken into account yet.
Where Klout’s strengths now lie is in the way it will tie a user’s Klout score to transactional information and their overall digital footprint online, allowing businesses that use Klout to reach audiences and customers with a more personalised and contextual experience.
Years ago Klout was purely used to measure and compare people online in a digital bragging rights context. There were other examples where a potential recruitment candidate’s Klout score was looked at for certain role profiles which demanded a command of an online audience or deep understanding of social networks.
Klout’s next challenge is to educate the wider online community and the average user that they need Klout as part of their own generated data, as retail and consumer companies seek to engage on deeper and more personal levels. To an extent, many current users haven’t outgrown trying to play the system just for higher scores, and wear a Klout score like a birthday badge. I wrote about this on Forbes a while back in a previous article on social media habits.
"Klout is a classic example of behavioural modification, where emphasis is placed on posting content that you know will hit the spot for a larger audience to get a push up the ladder, rather than posting something you know will be engaging to the people that really matter (and consequently won’t even garner a blip on the Klout graph)."
But Klout has now matured beyond this kind of audience, and as a first step, the publication of this paper on how its data science team employs sophisticated algorithms and machine learning techniques shows that Klout and Lithium Technologies are much more than just a number.
COMMENTARY: This is not the first time that Klout has changed its social influence algorithm. After numerous complaints, Klout changed the algorithm in 2011, 2012 and 2014. What I find very disturbing is that Klout places too much emphasis on the number of followers or fans that you have. I keep reading that users who bought followers on Twitter raised their scores significantly. Here are a few examples:
@BarackObama - President Barack Obama is the #POTUS, and according to Klout he commands a score of 98. However, twitteraudit reports that the #POTUS has a 37% audit score, claiming that over 41 million of his Twitter followers are "fake."
@HillaryClinton - Former Secretary of State Hillary Clinton has a Klout score of 94. However, twitteraudit reports that Madam Secretary has a 59% audit score, claiming that over 1.8 million of her Twitter followers are "fake."
@SenSanders - Senator Bernie Sanders has a Klout score of 81 and a 90% audit score, with only 90,000 "fake" followers.
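To put those audit figures in proportion, here is a rough back-of-the-envelope sketch. It assumes that an audit score is the percentage of followers judged real, so that the fake share is 100 minus the score; that reading, and the helper name estimated_total_followers, are my assumptions. The fake-follower counts are the rounded figures quoted above, so the totals are only approximate.

```python
def estimated_total_followers(fake_followers: int, audit_score_pct: float) -> float:
    """Back out a follower total from a fake count and an audit score.

    Assumption: the audit score is the percentage of followers judged real,
    so the fake share is (100 - audit_score_pct) / 100.
    """
    fake_share = (100 - audit_score_pct) / 100
    if fake_share <= 0:
        raise ValueError("audit score must be below 100 to back out a total")
    return fake_followers / fake_share


# (fake followers, audit score in percent), as quoted in the examples above
accounts = {
    "@BarackObama": (41_000_000, 37),
    "@HillaryClinton": (1_800_000, 59),
    "@SenSanders": (90_000, 90),
}
totals = {
    name: estimated_total_followers(fake, score)
    for name, (fake, score) in accounts.items()
}
```

On that assumption, the quoted figures imply roughly 65 million followers for @BarackObama, about 4.4 million for @HillaryClinton, and about 900,000 for @SenSanders.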
Granted, Barack, Bernie and Hillary all command tremendous influence by the nature of their current and past political resumes. However, a lot of their tweets, if not most of them, are automated. This is not a level playing field if you tweet 3 to 5 times per day like I do. Does Klout take the volume of tweets into account?
When I first joined Klout, I got a score of 40, which is considered an average social influence score. At that time, I spent most of my time tech blogging, and at one time achieved 40,000 unique visitors to my blog over a period of about one year. I was averaging about two to three blog posts per day at that time. Half of my fans were from the U.S.; the remainder were international, primarily from the U.K., Canada, the E.U. and India. I now average 1.5 blog posts per day and about 18,000+ unique visitors. My Klout score has gradually increased from 40 when I started, to 44, 46, and 47 as of today.
I spend most of my time on Twitter, where I now have 2,800 followers (96% authentic). I never really liked Facebook, so I only have 64 fans there. This is deliberate on my part, and most of my posts from Twitter are fed to my Facebook page. I also have a separate Facebook page, my Sharon Stone page, where I am getting about 300-360 likes per week. The Basic Instinct star is quite a draw. Klout can only key on one Facebook page to measure social influence. As a result, I have switched from my Facebook home page to my Sharon Stone page. I also have a lot of views on my LinkedIn page. Hopefully, the combination of Twitter, LinkedIn and Sharon Stone (Facebook) will increase my Klout score. I will get back to you on this one, but it should be interesting. My goal is to hit a Klout score of 50 within 30 days.
Courtesy of an article dated October 30, 2015 appearing in Forbes Tech