No, this equation isn’t the foundation of the Grand Unified Theory. It’s just a simple way of getting LMS activity data into a Learner Record Store (LRS) where it can be analyzed. Arizona State University is among the institutions pushing the learning analytics envelope, and continuing down that path requires agility in accessing and analyzing data. To achieve that agility, the Action Research Lab and the University Technology Office partnered to build a cloud-based LRS on the Amazon Web Services cloud platform. The goal of this post is to illustrate the steps we took to get Blackboard LMS data loaded into a cloud-based Amazon Redshift database.
Blackboard has long been committed to the IMS Caliper standard, so extracting events from the LMS is easy. Unfortunately, there isn’t a “Getting Caliper Events from Blackboard to Redshift For Dummies” guide, so we had to learn on the fly. This isn’t the only way (or necessarily the best way) to accomplish the goal, but in the spirit of collaboration, ASU wanted to share our approach with others who might be tackling the same issue.
You can see the AWS services used in the step-by-step workflow diagram. Here’s a little more color on each step:
- Blackboard Caliper Event Stream
- Blackboard provides good documentation for configuring Caliper on the LMS admin side. The Blackboard team was quick to help with any questions we had
- CloudFront
- Initially, we didn’t include this piece. However, it became necessary as a workaround
- API Gateway uses Server Name Indication (SNI), and that wasn’t compatible with Blackboard’s end of the TLS handshake. We used CloudFront’s custom SSL support for legacy (non-SNI) clients to get around this issue
- API Gateway
- API Gateway acts as the reliable traffic cop for receiving the Caliper events from Blackboard
- CloudWatch
- We put CloudWatch in for monitoring and error handling. For example, we’ll want alarms if there are wild swings in event counts from one day to the next (a sample alarm sketch follows this list)
- Lambda
- Lambda is our serverless environment for running ETL code on the events. For now, we don’t transform the data; eventually, though, we’ll want to shape the events to fit our relational schema (a minimal handler sketch follows this list)
- Kinesis Firehose
- This is our capacitor in the circuit. Events accumulate as they’re received and then load where we want, when we want. Given the variance in event volume and frequency, an elastic, cloud-based service like this is the right tool for the job
- S3 & Redshift
- When Kinesis loads batches of events, it needs two things. First, it needs a staging place for the data to land (S3). Second, it executes SQL COPY commands to write the data to database tables in Redshift (a sample delivery stream configuration follows this list)
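To make the middle of the pipeline concrete, here’s a minimal sketch (in Python, using boto3) of what a pass-through Lambda handler behind API Gateway could look like. The `caliper-events` stream name and the assumption that Blackboard POSTs a Caliper envelope as the request body are placeholders for illustration, not our production code.

```python
import base64
import json
import os

import boto3

# Hypothetical stream name, configured as an environment variable on the function
FIREHOSE_STREAM = os.environ.get("FIREHOSE_STREAM", "caliper-events")

firehose = boto3.client("firehose")


def handler(event, context):
    """Receive Caliper events from API Gateway and forward them to Kinesis Firehose."""
    # With a Lambda proxy integration, the POSTed payload arrives as a string
    # in event["body"] (base64-encoded if API Gateway treated it as binary).
    body = event.get("body") or "{}"
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")

    payload = json.loads(body)
    # Caliper envelopes carry the events in a "data" list; fall back to treating
    # the payload itself as the list if it arrives unwrapped.
    events = payload.get("data", [payload]) if isinstance(payload, dict) else payload

    # No transformation yet -- newline-delimit each event so that the downstream
    # Redshift COPY (json 'auto') can pull the records apart.
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    if records:
        # put_record_batch accepts up to 500 records per call; chunking is
        # omitted here to keep the sketch short.
        firehose.put_record_batch(DeliveryStreamName=FIREHOSE_STREAM, Records=records)

    return {"statusCode": 200, "body": json.dumps({"received": len(records)})}
```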
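Downstream, the Firehose delivery stream is what owns the S3 staging bucket and the Redshift COPY. A hedged sketch of that configuration, with placeholder ARNs, table name, credentials, and buffering settings, might look like this:

```python
import boto3

firehose = boto3.client("firehose")

# All ARNs, names, and credentials below are placeholders for illustration.
firehose.create_delivery_stream(
    DeliveryStreamName="caliper-events",
    DeliveryStreamType="DirectPut",
    RedshiftDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "ClusterJDBCURL": "jdbc:redshift://lrs-cluster.example.us-west-2.redshift.amazonaws.com:5439/lrs",
        "Username": "firehose_user",
        "Password": "CHANGE_ME",
        # Firehose stages each batch in S3, then issues this COPY against Redshift.
        "CopyCommand": {
            "DataTableName": "caliper_events_raw",
            "CopyOptions": "json 'auto' gzip timeformat 'auto'",
        },
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
            "BucketARN": "arn:aws:s3:::example-caliper-staging",
            "CompressionFormat": "GZIP",
            # Buffer size/interval control how often batches land and COPY runs.
            "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        },
    },
)
```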
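And the kind of CloudWatch alarm mentioned above could start as a simple threshold on Firehose’s IncomingRecords metric. The threshold, stream name, and SNS topic below are illustrative; a real “wild swing” check would likely compare against a historical baseline instead.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Placeholder stream name, topic ARN, and threshold.
cloudwatch.put_metric_alarm(
    AlarmName="caliper-events-daily-volume-low",
    Namespace="AWS/Firehose",
    MetricName="IncomingRecords",
    Dimensions=[{"Name": "DeliveryStreamName", "Value": "caliper-events"}],
    Statistic="Sum",
    Period=86400,                  # one day's worth of events
    EvaluationPeriods=1,
    Threshold=10000,               # alarm if fewer than ~10k events arrive in a day
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="breaching",  # a day with no data at all should also alarm
    AlarmActions=["arn:aws:sns:us-west-2:123456789012:lrs-alerts"],
)
```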
This step-by-step process is fairly rudimentary. There are two additional parts of the ecosystem that get a little more complex:
- ETL jobs: When we transform the data (in Kinesis or when COPYing to Redshift), we need to write the code to move specific parts of each Caliper event to the appropriate table (and to join with existing data if necessary); a sketch of that kind of field mapping follows this list
- LRS Schema: The data in Redshift needs to be joined in a logical manner that makes querying and analysis as efficient as possible
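To give a flavor of what that ETL code might look like, here’s a minimal sketch that pulls a handful of fields out of a single Caliper event and maps them onto rows for hypothetical actors and events tables. The field paths follow Caliper’s JSON structure, but the target tables and columns are placeholders rather than our actual schema.

```python
from typing import Any, Dict, Tuple


def split_caliper_event(event: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Map one Caliper event onto rows for hypothetical `actors` and `events` tables."""
    actor = event.get("actor") or {}
    obj = event.get("object") or {}

    actor_row = {
        "actor_id": actor.get("id"),
        "actor_type": actor.get("type"),
    }
    event_row = {
        "event_id": event.get("id"),
        "event_type": event.get("type"),
        "action": event.get("action"),
        "actor_id": actor.get("id"),   # foreign key back to the actors table
        "object_id": obj.get("id"),
        "event_time": event.get("eventTime"),
    }
    return actor_row, event_row
```

In practice, a routine like this could run in the Lambda step before events reach Firehose, fanning each event out to the tables it touches ahead of the COPY into Redshift.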
As we continue to develop this cloud-based LRS, we’ll try to share more about these detailed components. Until then, feel free to contact us with questions.