Transactional data is still the foundation for many businesses trying to mine data for insights, but big data has opened an entirely new realm of data mining prospects for a multitude of industries.
Big data offers the opportunity to go beyond modeling data and to model human intent, notes Mok Oh, chief scientist of PayPal.
“Ultimately, what we’re trying to model is every person’s brain – at least the part of the brain that decides how to shop, when to shop, and what you want,” Oh says. “We’re trying to reverse-engineer transactional data to figure out what people are going to buy next.”
As an example, a retailer might know that someone has purchased a shirt, but it doesn’t know that he looked at computer bags and jeans before buying that shirt, or that he is fixated on computer bags and the retailer simply doesn’t carry the one he needs. It’s easier to capture browsing behavior on an electronic platform like PayPal, but that still presents reverse-engineering challenges, Oh says.
Retailers maneuvering to boost sales are not the only ones benefiting. Many other organizations, including municipalities, are drawing on big data sources to devise practical applications for the technology.
The city of Boston, for example, has put into place an application that taps into big data to help locate potholes and dispatch repair teams to fix them. The smartphone application uses the phone’s accelerometer to detect bumps in the road – when a car hits a pothole, the app sends information about the bump, including its location, to a database.
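To make the idea concrete, here is a minimal Python sketch of how an app could turn accelerometer readings into pothole reports. It is not Boston’s actual implementation, which the article does not describe. The `Sample` and `BumpReport` types, the `detect_bumps` function, the 4.0 m/s² threshold, and the one-second gap between reports are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

GRAVITY = 9.81  # m/s^2, vertical acceleration of a phone at rest


@dataclass
class Sample:
    """One accelerometer reading paired with a GPS fix (hypothetical type)."""
    timestamp: float       # seconds since the trip started
    vertical_accel: float  # m/s^2 along the vertical axis
    latitude: float
    longitude: float


@dataclass
class BumpReport:
    """The record that would be sent to the database (hypothetical type)."""
    timestamp: float
    latitude: float
    longitude: float
    magnitude: float       # deviation from resting gravity, m/s^2


def detect_bumps(samples: List[Sample],
                 threshold: float = 4.0,
                 min_gap: float = 1.0) -> List[BumpReport]:
    """Report samples whose vertical acceleration deviates sharply from rest.

    threshold and min_gap are illustrative values, not calibrated ones.
    min_gap suppresses repeat reports that follow the previous report
    too closely, so one jolt does not produce several records.
    """
    reports: List[BumpReport] = []
    last_report_time: Optional[float] = None
    for s in samples:
        magnitude = abs(s.vertical_accel - GRAVITY)
        if magnitude < threshold:
            continue  # ordinary road vibration
        if last_report_time is not None and s.timestamp - last_report_time < min_gap:
            continue  # too soon after the previous report
        reports.append(BumpReport(s.timestamp, s.latitude, s.longitude, magnitude))
        last_report_time = s.timestamp
    return reports
```

A real system would likely need more than a fixed threshold, for instance filtering out speed bumps and railroad crossings, and combining reports from many drivers before dispatching a repair team.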
Here are the top 5 big data source types:
- Social network profiles – Tapping user profiles from Facebook, LinkedIn, Yahoo, Google, and specific-interest social or travel sites, to cull individuals’ profiles and demographic information, and extend that to capture their hopefully like-minded networks
- Social influencers – Editor, analyst and subject-matter expert blog comments, user forums, Twitter, and Facebook “likes,” Yelp-style catalog and review sites, and other review-centric sites like Apple’s App Store, Amazon, ZDNet, etc.
- Activity-generated data – Computer and mobile device log files, including website tracking information, application logs, and sensor data such as check-ins and other location tracking, along with other machine-generated content of the kind often grouped under the “Internet of Things”
- Software as a service (SaaS) and cloud applications – Represent data that’s already in the cloud but is difficult to move and merge with internal data
- Public – Data that is publicly available on the Web, such as that from the World Bank, SEC/EDGAR, Wikipedia, and IMDb, and that may enhance the types of analyses that can be performed