Webopedia defines big data analytics as "the process of collecting, organizing and analyzing large sets of data ("big data") to discover patterns and other useful information. Not only will big data analytics help you to understand the information contained within the data, but it will also help identify the data that is most important to the business and future business decisions." According to the SAS Institute Inc., "big data analytics is the process of examining big data to uncover hidden patterns, unknown correlations and other useful information that can be used to make better decisions. With big data analytics, data scientists and others can analyze huge volumes of data that conventional analytics and business intelligence solutions can't touch". According to Margaret Rouse (2012), big data can show true "customer preferences", and one of the goals of using big data is "to help companies make more informed business decisions". Teradata states that when big data is done correctly "it is the coming together of business and IT to produce results that differentiate, that power you forward and reduce costs. Big Data is less about the size of the data and more about the ability to handle lots of different data types and the application of powerful analytics techniques" (2014). This means "smarter decisions cut costs, improve productivity, enhance customer experience and provide any organization with a competitive advantage" (Teradata).
So why isn't everyone using big data? Rouse (2012) suggests that it is because organizations have "a lack of internal analytics skills and the high cost of hiring experienced analytics professionals" who know tools like Hadoop, Pig, Spark, MapReduce, Hive and YARN. ThoughtWorks Inc. points out that companies need to shift their thinking from the data itself to insight and impact, and to try to address unanswered questions. Schmarzo acknowledges that educational institutions are interested in using big data to find ways to "improve student performance and raise teacher/professor effectiveness, while reducing administrative workload" and to compare one institution to another, but he makes no mention of its use on the business side of the house, or of learning current LMS usage to compare against a possible replacement. van Rijmenam's infographic shows the benefits for learning, but still makes no mention of using big data for software changes. Fleisher explains that some institutions are not using it because they are concerned that acknowledging that they are recording all learning activities, and releasing the results, may harm students if this data got into the wrong hands. Guthrie points out that big data in education needs to go "beyond online learning", and that administrators need to "understand that big data can be used in admissions, budgeting and student services to ensure transparency, better distribution of resources and identification of at-risk students." (2013). Perhaps one could classify technology application purchases as a student service, but I do not think that is what Guthrie is referring to.
Coursera was the one place that mentioned the use of big data in education for more than learning. Their course description includes the statement: "to drive intervention and improvement in educational software and systems". So why aren't leaders who do software comparisons, including LMS reviews, required to learn big data techniques? I think it is because the top academic administrators are afraid they would find out that some of their decisions, based solely on "pilot survey results", were made on inaccurate data.
For example, let's assume an institution is currently trying to decide between two LMSs and reports: "The pilot consisted of 11 courses and 162 students. With 39 students, 5 faculty and 1 TA responding to a survey, when asked whether LMS2 or LMS1 was better for teaching and learning the results were":
| Preferred LMS | Responses | Share | Faculty/TA breakdown |
|---|---|---|---|
| LMS2 | 30/45 | 67% | Faculty only 5/7 |
| LMS1 | 4/45 | 9% | Faculty only 0/7 |
| Same | 5/45 | 11% | Faculty only 1/7 |
| n/a - unsure | 6/45 | 13% | TA only 1/7 |
Additional notes: only 11 courses used LMS2 in this single semester, out of a total of 2,094 courses. Only 162 students were included in the LMS2 test, out of the 3,991 students enrolled, and only 5 faculty and 1 TA were included, against the 780+ faculty on payroll.
At first glance, the 67% sticks out, and some may say it is a strong indicator that the institution needs to switch to LMS2, because only 33% wanted to stay with LMS1 or were not sure LMS2 offered enough added benefit to justify a change. But that 67% is a share of those who responded to the survey, not of everyone who wants to switch. The table says out of "7" faculty, yet the text states that only 5 faculty and 1 TA responded, and last I checked 5+1 is 6, not 7. If you compare the number of surveys completed (45) with the total number of pilot participants (162 students, 5 faculty and 1 TA, or 168), the 67% is based on only about 27% of those who participated in the pilot. The pilot itself covered only about 4% of the student population (162 of 3,991) and under 1% of the faculty population (5 of 780+). What about staff or business entities that use LMS1? They were not represented at all in these results. Other questions that come to mind, and that decision makers should be asking, are: (1) Did the faculty whose courses were included actively use LMS1 to the fullest? (2) Were the faculty included tech savvy? (3) Did the included faculty have a personal issue with LMS1? (4) Which courses were actually included? Were they freshman courses or senior-level courses? (5) What is more important: ease of use for faculty or better learning engagement options for students? (6) Had participants been shown how to use LMS1 as thoroughly as they were shown LMS2? (7) Which features of LMS2 were used, compared to the features of LMS1 that were used?
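The arithmetic behind these percentages is easy to check with a few lines of Python. This is only a sketch using the figures quoted in the example; the variable names and the `pct` helper are my own, and because the faculty count is given as "780+", I assume exactly 780, which makes the faculty coverage figure an upper bound.

```python
# Figures quoted in the pilot example (names are illustrative, not from the source).
pilot_students = 162
pilot_faculty = 5
pilot_tas = 1
respondents = 39 + 5 + 1      # students + faculty + TA who answered the survey
prefer_lms2 = 30              # respondents who said LMS2 was better
pilot_courses = 11
total_courses = 2094
total_students = 3991
total_faculty = 780           # ASSUMPTION: text says "780+", so this is a lower bound


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


pilot_participants = pilot_students + pilot_faculty + pilot_tas

summary = {
    "LMS2 share of survey responses": pct(prefer_lms2, respondents),
    "survey response rate within pilot": pct(respondents, pilot_participants),
    "LMS2 supporters as share of pilot": pct(prefer_lms2, pilot_participants),
    "students in pilot vs. enrollment": pct(pilot_students, total_students),
    "faculty in pilot vs. payroll": pct(pilot_faculty, total_faculty),
    "courses in pilot vs. all courses": pct(pilot_courses, total_courses),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

Run as written, the headline 67% shrinks to roughly 18% of everyone in the pilot who actually said LMS2 was better, and the pilot touches about 4% of students and well under 1% of faculty and courses.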
This basic example shows that survey results alone allow for skewed reporting. Add big data analytics to opinion surveys, and education decision makers would have a more realistic picture and could make better decisions for the most important stakeholder, the student. Garber provides other examples of how people spin survey results to get their way. In his examples he talks about how some people cherry-pick a statistic describing just a small percentage of a population to make things look better than they are, and says decision makers need to ask "What did the rest think?" (Garber). A 2012 paper in Survey Methodology discusses the need to develop an approach to detect interviewer falsification of survey data; that detection approach is not limited to interviewers and could also be applied to basic survey analysts. Robert Oak, in his article about the New York Post's claim of falsified unemployment figures, points out that falsification of figures is more commonplace. Johnson, Parker, & Clements stated in their research, "Likewise, satisfaction that little or no data falsification has been detected previously should not serve as an excuse for failure to continually apply careful quality control standards to all survey operations" (2001). Fanelli's 2009 research showed that "scientists admitted to have fabricated, falsified or modified data or results at least once –a serious form of misconduct by any standard– and up to 33.7% admitted other questionable research practices. In surveys asking about the behavior of colleagues, admission rates were 14.12% (N = 12, 95% CI: 9.91–19.72) for falsification, and up to 72% for other questionable research practices". That would make one think that researcher misconduct is prevalent. Or did Fanelli mislead us with these results?
Schmarzo states, "In a world where education holds the greatest potential to drive quality-of-life improvements, there are countless opportunities for educational institutions to collaborate and raise the fortunes of students, teachers, and society as a whole" (2014), and using big data along with old-fashioned surveys is one way to do so. The benefits of big data can be felt by all organizations.
- Crudele, John. 2013. Census ‘faked’ 2012 election jobs report.
- Coursera. 2013. Big Data in Education. Retrieved from https://www.coursera.org/course/bigdata-edu on 12/3/2014.
- Fleisher, Lisa. 2014. Big Data Enters the Classroom.
- Garber, Richard. 2014. How to spin the results of a survey.
- Guthrie, Doug. 2013. The Coming Big Data Education Revolution.
- Fanelli, Daniele. 2009. How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data.
- Rouse, Margaret. 2012. Big Data Analytics. Retrieved from http://searchbusinessanalytics.techtarget.com/definition/big-data-analytics on 12/1/2014.
- SAS Institute Inc. Big Data Analytics. Retrieved from http://www.sas.com/en_us/insights/analytics/big-data-analytics.html on 12/2/2014.
- Schmarzo, Bill. 2014. What Universities Can Learn from Big Data – Higher Education Analytics.
- 2012. Survey Methodology – A statistical approach to detect interviewer falsification of survey data. Retrieved from http://www5.statcan.gc.ca/olc-cel/olc.action?objId=12-001-X201200111680&objType=47&lang=en&limit=0.
- Oak, Robert. 2013. New York Post Claims Census Falsifies Unemployment Figures.
- Johnson, Timothy P., Vincent Parker, and Cayge Clements. 2001. Detection and Prevention of Data Falsification in Survey Research.
- Teradata. Big Data Solutions. Retrieved from http://bigdata.teradata.com/ on 12/1/2014.
- ThoughtWorks Inc. 2014. Big Data Analytics. Retrieved from http://www.thoughtworks.com/big-data-analytics on 12/3/2014.
- van Rijmenam, Mark. Big Data Improves Education - Infographic. Retrieved from https://datafloq.com/read/big-data-improve-education-infographic/393 on 12/3/2014.
- Webopedia. Big Data Analytics. Retrieved from http://www.webopedia.com/TERM/B/big_data_analytics.html on 12/2/2014.