Big Data is All The Buzz These Days!


Mike Talbot, Veracity Consulting’s Practice Director, Data Solutions

Big Data isn’t a new thing. Just like the Cloud, Big Data is all the buzz today and is talked about frequently in technology circles. To the non-technical client, all this chatter might be a bit overwhelming. More importantly: what, if anything, should businesses be doing about it? Are they missing out on some new competitive advantage? Is this another way to lower expenses and possibly improve the bottom line? As companies seek to learn more about this latest trend in technology, it’s important to understand what Big Data really means and how it came about.

Most companies can’t survive for even a few minutes without data: data generated and consumed in the form of sales orders, marketing surveys, social media, customer files, employee performance reviews, security feeds, and on and on.

Not that long ago—just before Cloud technology moved the needle on infrastructure costs, speed to market, and flexibility—the consumption and evaluation of large amounts of data was costly. Design considerations always included efficient ways to organize and consume data. As servers became faster and hard drive space became less costly, companies began to enjoy the luxury of collecting and evaluating more data. Transactional databases became bigger and faster and fed data warehouses that presented a tremendous array of data for business intelligence.

Many folks in the industry would agree that the benchmark event that moved this process from more data to Big Data came when Google cornered the internet search market by collecting and indexing virtually every site on the internet. Seemingly overnight, Google became the star student at the head of the Internet class.

Volume, Velocity, Variety and Veracity!

As Google made its journey to the top, the three Vs of Big Data were born, and they have since become the legendary, but accurate, buzzwords that characterize Big Data. Google dealt with a Volume of data on an unprecedented level. The Velocity at which Google needed to consume this data was also a factor: with millions of websites, and millions of updates to those websites every hour, Google had serious volume and critical velocity to manage. But now there was also Variety. Up until this time, most data warehouses dealt with a fairly standard set of data types that could be catalogued and indexed for best performance and consumption; images and video were two file types that didn’t easily fit into this mold. Beyond describing how Big Data overcame previous limits on Volume, Velocity, and Variety, the three Vs also established the evaluation criteria for determining whether a business has a Big Data need to be addressed.

Veracity embraced Cloud technology when most companies were unaware of its existence. As we scouted out its benefits for our clients, we did so knowing that, in some cases, a particular business or business sector might not be ready for, or even need, the Cloud. As Cloud technology evolved, our understanding of how the Cloud fit into our client partners’ businesses kept pace. Veracity has experienced the same evolution with Big Data.

So How Does It Work?

In a nutshell, Google pioneered a paradigm shift in the use of known technologies. Think about it: your laptop hard drive stores every file type imaginable, and those files are easily stored and retrieved. If you need more space, you can simply add a terabyte USB drive for less than $100. So, what if you could string together thousands of laptops or PCs? In a way, that’s exactly what Google did: low-cost, easy-to-replace computers were tied together in clusters, with each computer in a cluster acting as a node. This new way of seeing technology, and of using it, gave Google the means to address the limits of Volume and Velocity. Similar to how RAID spreads data across hard drives, Google’s approach was to stripe data across these nodes, so that if a PC failed, it could be replaced much the way you would hot-swap a server’s hard drive.
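To make that concrete, here’s a minimal Python sketch of the striping idea, with toy assumptions throughout: four named nodes, a four-byte block size, and two replicas per block are all my own illustrative choices, not how Google’s storage layer is actually configured (real systems use chunks of tens of megabytes and much smarter placement). The point is simply that losing a node loses no data.

```python
# Toy illustration of striping with replication across cheap nodes.
BLOCK_SIZE = 4          # bytes per block; real systems use ~64 MB chunks
REPLICAS = 2            # each block lives on two different nodes

nodes = {n: {} for n in ["node1", "node2", "node3", "node4"]}

def store(filename: str, data: bytes) -> None:
    """Split data into blocks and spread replicas round-robin across nodes."""
    names = list(nodes)
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    for i, block in enumerate(blocks):
        for r in range(REPLICAS):
            node = names[(i + r) % len(names)]
            nodes[node][(filename, i)] = block

def read(filename: str) -> bytes:
    """Reassemble the file from whichever replicas are still reachable."""
    found = {}
    for contents in nodes.values():
        for (name, i), block in contents.items():
            if name == filename:
                found[i] = block
    return b"".join(found[i] for i in sorted(found))

store("crawl.txt", b"the quick brown fox jumps over the lazy dog")
nodes["node2"].clear()   # simulate a failed PC in the cluster
assert read("crawl.txt") == b"the quick brown fox jumps over the lazy dog"
```

Because every block has a surviving replica, the failed node can simply be swapped out and re-populated, just as the paragraph above describes.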

Google stored the data using a system it called Bigtable, essentially a giant sparse map in which every value is indexed by a row key, a column key, and a timestamp. This key design allowed data to continue loading, unimpeded, and eliminated the need for a schema redesign as new kinds of data arrived. This addressed the Variety.
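Here’s an illustrative Python sketch of that keying scheme; the function names and example rows are my own, not Google’s API. The key point is that a new kind of column is just a new key, so nothing stops loading while the data model grows.

```python
# Minimal sketch of the Bigtable idea: one map keyed by
# (row, column, timestamp) instead of a fixed relational schema.
import time

table = {}  # {(row_key, column_key, timestamp): value}

def put(row: str, column: str, value) -> None:
    """Writes never block on schema changes: new columns are just new keys."""
    table[(row, column, time.time())] = value

def get_latest(row: str, column: str):
    """Return the most recent value stored for this cell, or None."""
    versions = [(ts, v) for (r, c, ts), v in table.items()
                if r == row and c == column]
    return max(versions)[1] if versions else None

# Rows with wildly different attributes coexist with no schema redesign.
put("com.example/index.html", "contents:html", "<html>...</html>")
put("com.example/index.html", "anchor:cnn.com", "Example link")
put("com.example/logo.png", "contents:image", b"\x89PNG...")
print(get_latest("com.example/index.html", "contents:html"))
```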

In perhaps the most ingenious application of the divide-and-conquer approach, Google put a legacy idea to new use. The map and reduce operations date back to functional programming in the early 1960s; Google’s MapReduce applied them at cluster scale to transform this massive pile of collected, variable data into a usable form.
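The pattern is easiest to see in the classic word-count example. Below is a small illustrative Python sketch, not Google’s implementation: the map phase emits (word, 1) pairs, a stand-in shuffle step groups them by key, and the reduce phase folds each group into a total. In a real cluster, the map and reduce calls run in parallel across many nodes.

```python
# Word count in the MapReduce style (single-machine sketch).
from collections import defaultdict

def map_phase(doc: str):
    """Emit a (word, 1) pair for every word in one document."""
    for word in doc.lower().split():
        yield word, 1

def reduce_phase(word: str, counts):
    """Fold all the counts for one word into a total."""
    return word, sum(counts)

docs = ["big data is all the buzz", "the buzz about big data"]

# The framework's shuffle step: group intermediate pairs by key.
grouped = defaultdict(list)
for doc in docs:                      # maps would run in parallel on a cluster
    for word, count in map_phase(doc):
        grouped[word].append(count)

results = dict(reduce_phase(w, c) for w, c in grouped.items())
print(results)   # {'big': 2, 'data': 2, 'the': 2, 'buzz': 2, ...}
```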

As I get a bit into the technical weeds, I should point out that many impressive variations of Google’s success have now appeared on the technology marketplace. One of the most impressive is Hadoop, an Apache open-source project named after a toy elephant belonging to the son of its creator, Doug Cutting. SAS, once known as a leader in the mainframe space, is also enjoying a re-emergence through Big Data. And there are many, many more.

As companies navigate the Big Data landscape, it’s important to understand how Big Data works and whether they really need it. Some companies may think they have a Velocity problem, only to find that a few schema or hardware changes to their transactional database would suffice. In that case, Big Data would be a waste of time and money.

Some companies may not need Big Data today, but as they build and grow their social networks, their marketing and overall data volume grow with them, and they may well arrive at the need to adopt this technology and product set. The good news is that, with the flexibility of the Big Data ecosystem, existing data and new types of data alike can be seamlessly consumed.

While I’m at it, I want to send a big thank you to Villanova University for adding Veracity as the fourth V of the Big Data revolution: “[…] Veracity is an indication of data integrity and the ability for an organization to trust the data and be able to confidently use it to make crucial decisions. […]”

The evolution of Cloud technology and the maturation of Big Data make this an exciting time in technology. So, grab your elephant by the ears, hop on, and enjoy the ride!

About Veracity Consulting

Veracity Consulting, an 8(a)-certified firm, has a long history of implementing and managing IT solutions and business strategies. The firm has partnered with clients across the nation, including in Austin, Dallas, Houston, Hawaii, Atlanta, Washington, D.C., and the Kansas City metro area.

Veracity offers a comprehensive range of consulting services, including strategic planning, BI strategy, modeling, and business analysis. Its Cloud Infrastructure Managed Services offer service desk management, infrastructure capacity management, and systems administration. Veracity Consulting is widely recognized for developing, engineering, and deploying innovative solutions that include expertise in enterprise data and cloud technologies.

Headquartered in Lenexa, Kansas, Veracity Consulting has principal offices in several locations across the United States. Learn more about Veracity and its team at www.engageveracity.com.

About Mike Talbot: Mike’s technology career began in 1987 with Yellow Freight’s Equipment and Purchasing team, a key position that gave him exposure to the sales and management sides of the business. Mike has facilitated successful efforts collecting, moving, transforming, and analyzing data for some of the Midwest’s largest firms, many involving high-profile infrastructure. His work spans business sectors including healthcare, transportation, telecom, and finance. For questions about this article or Veracity Consulting’s Data Solutions, email Mike at mike.talbot@engageveracity.com or contact a member of the Veracity Consulting leadership team.

Sources:

In a 2001 research report and related lectures, META Group (now Gartner) analyst Doug Laney defined data growth challenges and opportunities as three-dimensional: increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources). Gartner, and now much of the industry, continue to use this “3Vs” model to describe big data.

“What Is Big Data | Big Data Explained.” Villanova University. Web. 6 Feb. 2015. www.villanovau.com/resources/bi/what-is-big-data/

“What Is Big Data?” SAS Institute. Web. 5 Feb. 2015. http://www.sas.com/en_us/insights/big-data/what-is-big-data.html