Monica Kay Royal

The 10 V's of BIG DATA

Host: Data Science Dojo

Guest: George Firican


Celebrating Data Literacy Month by attending an event hosted by @Data Science Dojo


George Firican is an award-winning Data Governance Leader and Founder of LightsOnData. He talks about #data, #analytics, #datagovernance, and #datamanagement



This presentation was all about the 10 V's of Big Data. George is a VIP who shares a Vast amount of facts about big data, including some Vibrant examples


However, you can't jump right in without talking about the elephant in the room


How big is big data?

  • About 40,000 apps are added to the app store each month

  • There are 40,000 Google search queries every second

  • Facebook has 2.2 billion monthly active users

  • Over 1 billion people consume coffee every day (we know George is one of them ☕)

  • 294 billion emails are sent on a daily basis

  • The Milky Way has 300 billion stars

This sure does make data appear to be big, but George thinks that this term can be misleading and likes to think of it as complex data instead


Many of us are familiar with the 3 V's

- Volume

- Velocity

- Variety


Basically, there is a lot of data, it is generated all the time, and there are many different kinds


The other V's are a little more complex, and George gives perfect examples of each


Variability


This is often confused with variety

A coffee shop sells different types of coffee; this is a variety of offerings

However, if you go to that coffee shop 3 days in a row, order the same thing, but each day it tastes different... that is variability


One other really neat example George shared was that variability of data is what makes sentiment analysis so complicated

Since the same word can have several different meanings, it is very hard for a computer to identify its sentiment without context, which involves human interpretation (for now, until the singularity 🤖)
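
To make this a bit more concrete, here is a minimal sketch of my own (not something George showed). It assumes a tiny made-up word lexicon, where the words, the scores, and the function name naive_sentiment are all hypothetical, and it scores text one word at a time with no context:

```python
# Toy word-level sentiment lexicon (hypothetical scores, for illustration only)
LEXICON = {"sick": -1, "terrible": -1, "love": 1, "great": 1}


def naive_sentiment(text: str) -> int:
    """Sum per-word scores, ignoring context entirely."""
    cleaned = text.lower().replace("!", "").replace(".", "").replace(",", "")
    return sum(LEXICON.get(word, 0) for word in cleaned.split())


praise = "That espresso was sick!"         # slang: the speaker loved it
complaint = "That espresso made me sick."  # literal: the speaker felt ill

# Both sentences get the same score because "sick" always counts as negative
print(naive_sentiment(praise), naive_sentiment(complaint))
```

A person reads those two sentences as opposites, but a word-by-word scorer cannot tell them apart. That is the variability problem in a nutshell.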

Veracity


When you get a box of chocolates, how do you estimate how good they are without tasting them?

You can look at them, or see who makes them

and only if they came from Switzerland do you know they are good 😂


The strange part with this V is that as the previous V's increase, Veracity tends to decrease... plummeting into the unknown


Validity


How accurate and correct is the data for its intended use?

A watch works fine for telling time, but it is not accurate to the millisecond, so you likely could not use it to break a tie in a race


Also, if you are looking at registration data from a conference, you may only see the person's job title or company. Yes, this is accurate, but if you want to know the names of the individuals, it is not valid


Volatility


How old can the data be before it is considered irrelevant, historic, or not useful?

You may need the data to be really old if you are analyzing a trend


On the flip side, you may need only the most recent data, which could make even day-old data too old (or as George calls it, dark data)
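
Here is a small sketch of my own to illustrate the idea. The records, the dates, and the still_relevant helper are all hypothetical. The same data set is "fresh enough" or "too old" depending on the age limit each use case sets:

```python
from datetime import datetime, timedelta


def still_relevant(records, now, max_age):
    """Keep only records whose timestamp is within max_age of now."""
    return [r for r in records if now - r["timestamp"] <= max_age]


# Hypothetical reference time and records, for illustration only
now = datetime(2022, 9, 30, 12, 0)
records = [
    {"id": 1, "timestamp": now - timedelta(hours=2)},
    {"id": 2, "timestamp": now - timedelta(days=1, hours=1)},
    {"id": 3, "timestamp": now - timedelta(days=400)},
]

# A real-time use case: anything older than a day is too old
fresh = still_relevant(records, now, timedelta(days=1))

# A trend analysis: several years of history is still useful
trend = still_relevant(records, now, timedelta(days=5 * 365))

print([r["id"] for r in fresh])
print([r["id"] for r in trend])
```

The data did not change between the two calls, only the cutoff did. Volatility is really a question about the use case, not just the data.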


Vulnerability


This is all about security and privacy (topics near and dear to my heart)

Some types of data are considered PII (personally identifiable information)


Sharing your Twitter handle alone is fine, but if you pair it with your name, it becomes personal. Therefore, things can get really bad in the event of a data breach


Visualization


It is very challenging to visualize big data

Not only is it nearly impossible to draw traditional graphs when trying to plot a billion points, but the approach is also not scalable and hurts response time
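
One common workaround (my own addition, not from the talk) is to summarize the points before drawing anything. The sketch below uses 100,000 random values as a stand-in for a much larger data set, and the bin_counts helper is a hypothetical name. It reduces the points to 20 bin counts, so a chart only has to draw 20 bars no matter how many points came in:

```python
import random


def bin_counts(values, low, high, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    counts = [0] * num_bins
    width = (high - low) / num_bins
    for v in values:
        if low <= v < high:
            # min() guards against floating-point rounding at the top edge
            index = min(int((v - low) / width), num_bins - 1)
            counts[index] += 1
    return counts


# Stand-in data: 100,000 random points in [0, 1), seeded so runs are repeatable
rng = random.Random(0)
points = [rng.random() for _ in range(100_000)]

counts = bin_counts(points, 0.0, 1.0, 20)

# 20 numbers to plot instead of 100,000 individual points
print(len(counts), sum(counts))
```

The trade-off is that you lose the individual points, so this works for seeing the overall shape of the data but not for finding one specific record.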


Value


Data is meaningless if it does not bring value, but how to express that value is the hard part. George shared a few examples of how collecting data can be valuable to a company.


They all seem a little sketchy IMO, but I understand the value (to an extent; I am super cautious about data given my audit and cybersecurity background)



In Conclusion

To learn more from George, check out his online courses and follow him on LinkedIn


Information from the host, Data Science Dojo:

For upcoming webinars & crash courses, please visit here

For the webinar recordings & queries, professional networking, and data science resources, join us here on LinkedIn

Follow us on Instagram for data science sliders, infographics, memes, and short videos



Happy Learning!!
