Building a Hadoop data pipeline – Where to start?

To convert data into business value, the data have to be at the forefront of software projects. And you can't limit the data you're using to just the straightforward stuff in RDBMS tables. Valuable data come in structured form (RDBMS tables), but they also come in unstructured (text comments from reviews, logs) and semi-structured (XML) forms. The ability to process and harness all forms of data is crucial for turning them into business value. To have lasting value, all of this must be done in a systematic manner that can be extended, tested, and maintained. Having a data pipeline to crunch the data and distribute results to the business is vital.

What is a Data Pipeline?

In the general sense, a data pipeline is the process of structuring,…

Evidence

I have been watching the dialog about the efficacy of the Course Signals results with interest. I give a tremendous amount of credit to the Course Signals team, as I think they have been a positive catalyst for activity in higher ed analytics over the past seven or so years. I also think it's healthy to have discussions about the validity and efficacy of results. If done in a constructive fashion, they will only further the cross-institutional learning that's happening in our space. The reason I started Blue Canary is that I wasn't seeing enough practical implementations of analytics that produced reasonably sound evidence of positive student outcomes. Hence, this discussion about Course Signals is salient. Like the e-Literate team, I have also pointed to the Purdue project as…