Experimentation in Split: Make Your Events Work for You!

So, let me guess… you love feature delivery with Split, and you’ve heard about experimentation. Now you think you’re ready for the next step.

Awesome!

This post is filled with details about how you can get started, and how Split makes it easy! But maybe you have questions? Things like:

  • What is an event?
  • How does one extract events?
  • What if my events platform doesn’t support webhooks?
  • How does one transform events?
  • How are events sent to Split?
  • Is this really all worth it?

The good news is that if you’re a Split customer, you’re basically already on your way! (And, shameless plug, you can hear about all of the above at Flagship, March 16-17!)

Maybe you’ve heard, “a split is a feature flag and an experiment rolled into one.” We say this because Split automatically keeps track of which features were given to each of your customers (we call these records impressions). Impressions alone can only tell you who is in the A group and who is in the B group of your A/B test (because that’s what a lot of experimentation is!). To have an experiment, you also need data about what those users experienced and what they did. In other words, you need events.

There are events that describe things other than users, but in this post we’ll focus on the events of anonymous and logged-in users (other types can be handled similarly anyway).

The good news for many Split customers is that events are readily available.  If you’re lucky enough to use mParticle, Segment, or Google Analytics, you can stop reading this post and start reading our docs (at the links in this paragraph) to see how to leverage your existing events in Split. We’ll call that early graduation.

For the rest of you, keep reading and learn. My goal is to teach you the steps necessary to extract, transform, and load events to Split. I will draw on field integration code examples for Amplitude, MixPanel, Tealium, and Rudderstack. If you’re using one of those tools, most of the work has been done for you already. If you’re not, the blueprints I provide should help you build your own. Most of these integrations are available for review on GitHub: Amplitude, MixPanel, and Rudderstack.

What is an event?

Instead of describing events in the abstract, let me give you a series of examples.  Most tools pass a generic event with a track SDK call.  These examples are all JavaScript, denuded of their initialization so you can just see the event pass itself.  Witness the birth of an event.

// mParticle Example
// Note the custom properties payload: flower and quantity.
mParticle.logEvent(
  'Flower Seen',
  mParticle.EventType.Transaction,
  {'flower': flower, 'quantity': 5}
);

// Segment Example
// More custom property payload, a mix of strings and numbers.
// Note the incorporation of userAgent.
analytics.track("tracked_by_segment_dbm", {
  userAgent: navigator.userAgent,
  plan: "Pro Annual",
  accountType: "Google",
  total: 50.00,
  identityId: "red1234"
});

// Tealium Example
// Note the array syntax.
utag.link({
  "tealium_event" : "product_view",
  "product_id" : ["12345"],
  "product_name" : [flower],
  "product_quantity" : ["2"],
  "product_unit_price" : ["12.99"]
});

// Rudderstack Example
// The callback is unique.
// It’s useful for multiplexing events or event properties at their source.
rudderanalytics.track(
  "split track event",
  { revenue: 42, currency: 'USD', user_actual_id: user_id },
  () => { console.log("in track call"); }
);

// Split Example
var properties = { region: 'TX', status: 'silver' };
client.track('processing_time_ms', 1024, properties);

If you choose to use Split to report your events, you don’t have to finish the rest of the post. Why? Because Split events are directly transported, asynchronously and in batch, to the Split cloud. Why wouldn’t you just use Split? In most cases, the track calls to create events are already in place with one of the tools shown. Customer Data Platforms (CDPs) can do a cloud-to-cloud transfer of events to Split, removing any need to add Split tracking on top of the existing tracking. This post exists to liberate events!  

If you skimmed through the examples, you’d notice that every approach includes not just an event but some details about the event.  The details are often called properties, and they make the events much more useful when you go to do analytics.

You will want to carry almost all of those properties over to Split. I say almost because sometimes there are “bookkeeping details” that can be left behind. Let’s see how event extraction works so we can have a mapping strategy.

How do you extract events?

In short, by webhook or stand-alone API extraction.  

A webhook is a function you host in your cloud. Google has Cloud Functions, and AWS has Lambdas; both are popular, but many other providers can host a webhook.

Many tools will let you specify headers to your webhook, so you can do things like configuring the Split environment and traffic type you want to use when you start sending traffic. Let’s look at a webhook to get a clearer understanding.

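// Tealium Webhook
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.LinkedList;
import java.util.List;

import org.apache.commons.io.IOUtils; // assumed source of IOUtils: Apache Commons IO
import org.json.JSONArray;            // assumed source of the JSON types: org.json
import org.json.JSONObject;

import com.google.cloud.functions.HttpFunction;
import com.google.cloud.functions.HttpRequest;
import com.google.cloud.functions.HttpResponse;

public class Tealium2Split implements HttpFunction {

  public void service(HttpRequest request, HttpResponse response) throws Exception {
    // Read the whole request body into a string
    StringWriter writer = new StringWriter();
    IOUtils.copy(request.getInputStream(), writer, StandardCharsets.UTF_8);
    String requestBody = writer.toString();

    // Parse the body as a single JSON event
    JSONObject eventObj = new JSONObject(requestBody);
    List<JSONObject> events = new LinkedList<JSONObject>();
    events.add(eventObj);

    JSONArray splitEvents = new JSONArray();
    for(JSONObject e : events) {
      // begin creating Split events ….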

The webhook takes the request’s input stream and copies it into a string. The string is parsed into a JSON object (most webhooks provide input as JSON). The events are then looped through to produce corresponding events for Split. This example does not use streaming.
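The header-based configuration mentioned earlier can be read in the same function. Here is a minimal sketch, not code from the integrations: the header names `x-split-environment` and `x-split-traffic-type` are hypothetical, and the headers are modeled as a plain map from header name to a list of values.

```java
import java.util.List;
import java.util.Map;

final class WebhookConfig {
    // Hypothetical header names; use whatever your events tool lets you configure.
    static final String ENVIRONMENT_HEADER = "x-split-environment";
    static final String TRAFFIC_TYPE_HEADER = "x-split-traffic-type";

    // Returns the first value of the named header (matched case-insensitively),
    // or the fallback when the header is missing or has no values.
    static String headerOrDefault(Map<String, List<String>> headers, String name, String fallback) {
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            if (entry.getKey().equalsIgnoreCase(name) && !entry.getValue().isEmpty()) {
                return entry.getValue().get(0);
            }
        }
        return fallback;
    }
}
```

Falling back to a default keeps the webhook usable even when a sender forgets a header, at the cost of events quietly landing in the default environment.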

An AWS Lambda looks very similar to the Google Cloud Function.

// Split to MixPanel integration
public class SplitImpressions2MixPanelEvents implements RequestStreamHandler {

  @Override
  public void handleRequest(InputStream input, OutputStream output, Context context) throws IOException {
    long start = System.currentTimeMillis();
    String json = IOUtils.toString(input, Charset.forName("UTF-8"));
    // continue transformation to MixPanel events

We have a handleRequest instead of a service method, but otherwise the picture is almost identical. The MixPanel example is a webhook you register with Split: Split has its own webhooks for exporting impression and audit data to third parties (like MixPanel).

In both cases, the request’s input stream is fully read into a string before processing continues. A more sophisticated implementation would read the input and produce Split events as a stream.

// MixPanel to Split Event Integration
// Note how it reads input and produces Split events as a stream
BufferedReader reader = new BufferedReader(new InputStreamReader(byteStream));
String line = null;
while((line = reader.readLine()) != null) {
  rawEvents.put(new JSONObject(line));
  if(rawEvents.length() >= config.batchSize) {
    sendEventsToSplit(config, rawEvents);
    rawEvents = new JSONArray();
  }
}

In this example, the byteStream is consumed by a BufferedReader. Each line of input is a single event, making it convenient to parse them into JSONObject instances off the stream.  After a batchSize of events are consumed, they are sent to Split in-line (could have been sent on another thread).

Streaming is beneficial when the input sizes are large. If you are reading more than a megabyte, you should use streaming. The Amplitude and MixPanel event integrations stream.
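If you want to try the read-and-batch pattern without any JSON library, here is a minimal sketch using only the JDK. The class and method names are mine, not the integrations', and the `sender` callback stands in for whatever actually posts a batch to Split. It also flushes the last, partially filled batch once the input ends.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class LineBatcher {
    // Reads one event per line and hands each full batch to the sender as soon
    // as it fills up. Returns the total number of non-empty lines read.
    static int forEachBatch(Reader input, int batchSize, Consumer<List<String>> sender) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        BufferedReader reader = new BufferedReader(input);
        List<String> batch = new ArrayList<>(batchSize);
        int total = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines between events
                }
                batch.add(line);
                total++;
                if (batch.size() >= batchSize) {
                    sender.accept(batch);
                    batch = new ArrayList<>(batchSize);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!batch.isEmpty()) {
            sender.accept(batch); // flush the final, partially filled batch
        }
        return total;
    }
}
```

Memory use stays bounded by the batch size rather than the input size, which is the whole point of streaming.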

What if my events platform doesn’t support webhooks?

Then it will almost certainly support a REST API you can use to extract data. You need to do a little more work to get the data than with a webhook, including using an HTTP client library to call the data export API of your tool. Java has a built-in HTTP client, and there are popular third-party libraries as well. Runtimes like Node.js have built-in HTTP modules too.

At the least, extraction APIs let you say what time period you want to export. Some allow you to specify very granular timestamps, and others give you one-day increments. If you want to extract high volumes of events, you should aim to run often, grabbing short periods each time. You can batch the events you create back to Split.
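Grabbing short periods each time comes down to slicing a time range into windows. A minimal sketch, assuming your export API accepts a start and end timestamp; the class name and the two-element array representation are my own choices for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class ExportWindows {
    // Splits [start, end) into consecutive windows no longer than step.
    // Each element is a two-item array: {windowStart, windowEnd}.
    static List<Instant[]> split(Instant start, Instant end, Duration step) {
        if (step.isZero() || step.isNegative()) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<Instant[]> windows = new ArrayList<>();
        Instant cursor = start;
        while (cursor.isBefore(end)) {
            Instant next = cursor.plus(step);
            if (next.isAfter(end)) {
                next = end; // the last window may be shorter than step
            }
            windows.add(new Instant[] { cursor, next });
            cursor = next;
        }
        return windows;
    }
}
```

Each window then becomes one call to the export API, and a failed window can be retried without re-extracting everything else.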

The MixPanel to Split integration is an excellent example of calling an API to retrieve events; the streaming code in the example above comes from that integration.

For an advanced example, consider Amplitude’s bulk events API. This one is trickier than most because it responds with a zipped archive of JSON events. You can see the full solution on GitHub. Pay attention to how the input is streamed to an iterator that decompresses and resolves into events.
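To give a feel for streaming out of an archive, here is a minimal sketch using only the JDK. It is not the code from the Amplitude integration: the class and method names are mine, and it assumes each file in the archive holds one JSON event per line. If your export also gzips each file, you would wrap the entry stream in a `GZIPInputStream` before reading lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class ZippedEventReader {
    // Streams every non-empty line of every file in the archive to the handler
    // without unpacking the archive to disk. Returns the number of lines handled.
    static int forEachLine(InputStream zipped, Consumer<String> handler) {
        int count = 0;
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Deliberately not closed: closing this reader would close the
                // whole archive stream, not just the current entry.
                BufferedReader reader =
                        new BufferedReader(new InputStreamReader(zip, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!line.trim().isEmpty()) {
                        handler.accept(line);
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```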

How do you transform events?

Let’s look at a sample event tracked to Rudderstack.

// Rudderstack sample track event
{
  "anonymousId": "dfbd9280-8727-42e1-a16f-6bf2fb092288",
  "channel": "web",
  "context": {
    "app": {
      "build": "1.0.0",
      "name": "RudderLabs JavaScript SDK",
      "namespace": "com.rudderlabs.javascript",
      "version": "1.1.13"
    },
    "campaign": {},
    "library": {
      "name": "RudderLabs JavaScript SDK",
      "version": "1.1.13"
    },
    "locale": "en-US",
    "os": {
      "name": "",
      "version": ""
    },
    "page": {
      "path": "/rudderstack-split.html",
      "referrer": "$direct",
      "referring_domain": "",
      "search": "",
      "title": "Rudderstack-Split Attributes Demo",
      "url": "http://localhost:8080/rudderstack-split.html"
    },
    "screen": {
      "density": 2
    },
    "traits": {
      "email": "david.martin@split.io"
    },
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36"
  },
  "event": "split track event",
  "integrations": {
    "All": true
  },
  "messageId": "d8e6fd88-a748-47f1-8c41-81b4d8cbfbf9",
  "originalTimestamp": "2021-03-10T22:47:44.629Z",
  "properties": {
    "currency": "USD",
    "revenue": 42,
    "user_actual_id": "c5a20f73-650f-49e7-85d4-22159ccb7448"
  },
  "receivedAt": "2021-03-10T22:47:44.781Z",
  "rudderId": "c96999bf-a110-4cb6-9340-da31996539cd",
  "sentAt": "2021-03-10T22:47:44.629Z",
  "type": "track",
  "userId": "c5a20f73-650f-49e7-85d4-22159ccb7448"
}

Gold! Most of this event can be passed to Split in event properties. It must be flattened to do that, though.  So the rich context object in the Rudderstack event presents a challenge.

Also, the timestamps are ISO 8601 strings in UTC, and Split wants them as milliseconds since the epoch. We also have to decide when to send the event with anonymousId as its Split key, or userId (or both). Overall, though, the mapping is clear. In the table below, the left-hand side is what Split expects in each event. The right-hand side is the mapping to the Rudderstack property shown in the sample track event above.

| What Split expects in each event | Mapping to the Rudderstack property in the sample track event above |
| --- | --- |
| key | userId or anonymousId |
| trafficType | pulled from configuration (stand-alone) or HTTP header (webhook) |
| eventTypeId | a cleaned version of event (Split allows [a-zA-Z0-9][-_\.a-zA-Z0-9]{0,62}) |
| environmentName | also pulled from configuration |
| timestamp | originalTimestamp converted to milliseconds since epoch |
| properties | see below |
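The three fiddly rows of that mapping (key, eventTypeId, and timestamp) can be sketched with nothing but the JDK. This is an illustration under assumptions, not the integration's code: replacing disallowed characters with underscores matches the sample output ("split track event" becomes "split_track_event"), but the "e" prefix for names that start with punctuation and the prefer-userId rule are hypothetical policies you may want to change.

```java
import java.time.Instant;

final class SplitEventMapping {
    // Split allows [a-zA-Z0-9][-_\.a-zA-Z0-9]{0,62}, so 63 characters at most.
    static final int MAX_EVENT_TYPE_LENGTH = 63;

    // Cleans a raw event name so it fits the pattern Split allows.
    static String cleanEventTypeId(String rawName) {
        // Replace every disallowed character (spaces, slashes, ...) with an underscore.
        String cleaned = rawName.trim().replaceAll("[^a-zA-Z0-9_.\\-]", "_");
        // The first character must be a letter or digit; the "e" prefix is a
        // hypothetical choice for names that start with punctuation or are empty.
        if (cleaned.isEmpty() || "-_.".indexOf(cleaned.charAt(0)) >= 0) {
            cleaned = "e" + cleaned;
        }
        return cleaned.length() > MAX_EVENT_TYPE_LENGTH
                ? cleaned.substring(0, MAX_EVENT_TYPE_LENGTH)
                : cleaned;
    }

    // Converts an ISO 8601 UTC timestamp to milliseconds since the epoch.
    static long toEpochMillis(String isoTimestamp) {
        return Instant.parse(isoTimestamp).toEpochMilli();
    }

    // One possible policy: prefer the logged-in userId, else use the anonymousId.
    static String chooseKey(String userId, String anonymousId) {
        return (userId != null && !userId.isEmpty()) ? userId : anonymousId;
    }
}
```

If you want both identities represented, you could send the event twice, once per key, each with its own traffic type.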

How do we get all those properties recursively?  Recursively!

// Recursive flattening of properties into a master map
// (JSONObject and JSONArray are assumed to come from org.json)
private void putProperties(Map<String, Object> properties, String prefix, JSONObject obj) {
  for(String k : obj.keySet()) {
    Object value = obj.get(k);
    if(value instanceof JSONArray) {
      // Assumes array elements are JSON objects; they are flattened under the
      // same dotted prefix, so later elements overwrite earlier ones
      JSONArray array = (JSONArray) value;
      for(int j = 0; j < array.length(); j++) {
        putProperties(properties, prefix + k + ".", array.getJSONObject(j));
      }
    } else if (value instanceof JSONObject) {
      // Recurse with the full path so deeper levels keep every parent key
      putProperties(properties, prefix + k + ".", (JSONObject) value);
    } else {
      // Leaf value: store it under its dotted path
      properties.put(prefix + k, value);
    }
  }
}

In the above example, the context object has all of its object children flattened. Each level is separated by a period. The end result is a Split event with all the goodies preserved.

// Split event as received and transformed from Rudderstack
{
  "environmentId": "194da2f0-3e22-11ea-ba75-12f2f63694e5",
  "environmentName": "Prod-Default",
  "eventTypeId": "split_track_event",
  "key": "8d90ef47-189b-4a55-af8a-5776188126f9",
  "properties": {
    "context.app.namespace": "com.rudderlabs.javascript",
    "context.os.version": "",
    "context.screen.density": "2",
    "channel": "web",
    "context.library.version": "1.1.13",
    "receivedAt": "2021-03-10T23:42:12.743Z",
    "type": "track",
    "context.library.name": "RudderLabs JavaScript SDK",
    "context.page.search": "",
    "context.page.url": "http://localhost:8080/rudderstack-split.html",
    "revenue": "42",
    "user_actual_id": "8d90ef47-189b-4a55-af8a-5776188126f9",
    "context.ip": "24.28.74.26",
    "currency": "USD",
    "context.page.path": "/rudderstack-split.html",
    "context.app.build": "1.0.0",
    "context.page.title": "Rudderstack-Split Attributes Demo",
    "context.os.name": "",
    "context.userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36",
    "sentAt": "2021-03-10T23:42:12.642Z",
    "context.app.name": "RudderLabs JavaScript SDK",
    "context.locale": "en-US",
    "context.traits.email": "david.martin@split.io",
    "context.page.referrer": "$direct",
    "context.app.version": "1.1.13",
    "context.page.referring_domain": "",
    "rudderId": "4da9fcd7-9f20-4ca4-842a-e7dea52f75c5"
  },
  "receptionTimestamp": 1615419750458,
  "timestamp": 1615419732642,
  "trafficTypeId": "194c6a70-3e22-11ea-ba75-12f2f63694e5",
  "trafficTypeName": "user",
  "value": 0
}

It isn’t JSON’s nature to alphabetize, but if you compare the original Rudderstack sample track event with the Split event as received and transformed from Rudderstack, you’ll discover that most of the source event’s properties have been preserved in the Split event.

How do you send events to Split?

RESTfully.  Consider this code.

// RESTfully POSTing events to Split
// (imports assume Apache HttpClient 4.x and org.json)
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.json.JSONArray;

public class CreateEvents {
  ...

  public void doPost(JSONArray events) throws Exception {
    CloseableHttpClient client = HttpClients.createDefault();
    HttpPost httpPost = new HttpPost("https://events.split.io/api/events/bulk");
    // Walk the events in batchSize steps, pausing a second between POSTs
    int i = 0;
    for( ; i < events.length();) {
      JSONArray batch = new JSONArray();
      int j = i;
      for( ; j < i + batchSize && j < events.length(); j++) {
        batch.put(events.getJSONObject(j));
      }
      System.out.println("INFO - sending events " + i + " -> " + j + " of " + events.length());
      postToSplit(client, httpPost, batch);
      i += batchSize;
      Thread.sleep(1000);
    }
    client.close();
  }

  private void postToSplit(CloseableHttpClient client, HttpPost httpPost, JSONArray batch)
      throws UnsupportedEncodingException, IOException, ClientProtocolException {
    // The batch goes out as a JSON array in the POST body
    StringEntity entity = new StringEntity(batch.toString(2), Charset.forName("UTF-8"));
    httpPost.setEntity(entity);
    httpPost.setHeader("Content-type", "application/json");
    String authorizationHeader = "Bearer " + apiToken;
    httpPost.setHeader("Authorization", authorizationHeader);
    System.out.println("INFO - " + httpPost.toString());
    CloseableHttpResponse response = client.execute(httpPost);
    System.out.println("INFO - POST to Split status code: " + response.getStatusLine());
    // Any 4xx or 5xx response dumps the rejected batch and stops the run
    if(response.getStatusLine().getStatusCode() >= 400) {
      System.err.println(batch.toString(2));
      System.exit(1);
    }
    response.close();
  }
}

This example shows batching events to Split’s events endpoint. You’re sending JSON, so your programming language of choice will have clever ways to put your events together for the POST body. Note that you’ll need a server-side API token to send an event.
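If you would rather avoid a third-party HTTP library, the same request can be put together with the HTTP client built into Java 11 and later. A minimal sketch that only builds the request and does not send it; the class name is mine, the token is whatever server-side API token you have, and the endpoint and headers are the ones used in the example above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class SplitEventsRequest {
    static final String BULK_EVENTS_URL = "https://events.split.io/api/events/bulk";

    // Builds (but does not send) the POST for one batch of events.
    // jsonArrayBody is expected to be a JSON array of Split events.
    static HttpRequest build(String apiToken, String jsonArrayBody) {
        return HttpRequest.newBuilder(URI.create(BULK_EVENTS_URL))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiToken)
                .POST(HttpRequest.BodyPublishers.ofString(jsonArrayBody, StandardCharsets.UTF_8))
                .build();
    }
}
```

Passing the result to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` would perform the POST. As in the example above, a status code of 400 or higher means the batch was not accepted.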

Is it all worth it?

Yes, it’s worth it. 80% of features have a neutral or negative impact. We call this the “impact gap” at Split, and it’s a real thing.  

If you don’t measure, you won’t know if you’re a success. You could build countless features on top of one that never even resonated with your customer base in the first place. Not being able to calculate and guess the right answer isn’t a failure. Refusing to measure your results is a failure. Split is your golden ticket to measuring, getting yourself onto a sound track, and doubling down on only the innovations that matter.

Plus, you can get some incredible views of your features.

Split will study your organization’s metrics and keep you informed of how each of your features is having its impact.  Did a new feature result in a drop in signups?  Are those errors background noise, or are they coming from that 5% canary rollout you just kicked off?  Ever wish you could just hit a (kill) button to just stop that crazy world?

You’re already in the driver’s seat.  You may even have a nice dashboard.  Now just make sure you can look out the windshield and see where you’re going. Try Split today! And seriously, if you’ve read this far you know you want to hear how companies like Twilio, Speedway, Comcast, and Experian are running experimentation programs, and delivering business impact. To hear those stories and more, join us at Flagship!