New on the Blog

Improving the Development Process with Metrics-Driven Insights

Read the five-part series on Medium:

Imhotep: Scalable, Efficient, and Fast. VP of Engineering Jack Humphrey kicks off a five-part series about improving the development process with metrics-driven insights and Imhotep, our open source data analytics platform.

Using Metrics to Improve the Development Process (and Coach People). We use a measure-question-learn-improve cycle to refine our development processes: we measure everything we can, learn by asking questions and exploring the data we’ve collected, use what we learn to try to improve, and keep measuring to confirm the improvement.

Metrics-Driven Process Improvement: A Case Study. Insights in practice: how we used Imhotep, our open source data analytics platform, to improve translation verification.

What’s Up, ASF? Using Imhotep to Understand Project Activity. Jack uses ASF Jira data to demo the value of analyzing project activity and introduce Imhotep Builder Directory, our new open source Jira Actions builder.

The Benefits of Hindsight: A Metrics-Driven Approach to Coaching. Hindsight is the internal tool we use to make contributions visible and drive coaching insights.


Featured Project: Indeed MPH

Indeed MPH (Minimal Perfect Hash Tables) provides an immutable key/value store with efficient space utilization and fast reads. To learn more, read this overview on our Indeed Engineering blog.

Random lookup latency in microseconds
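
To make the idea of a minimal perfect hash concrete, here is a small sketch in Java. It is not the Indeed MPH API: the class `TinyMphStore` and its methods are hypothetical names, and the brute-force seed search only works for tiny key sets. It assumes all keys are known up front and have distinct `hashCode()` values, and it looks for a seed that maps the n keys to the n slots 0..n-1 with no collisions and no gaps.

```java
import java.util.Map;

/** Toy immutable key/value store backed by a brute-force minimal perfect hash (illustrative only). */
final class TinyMphStore {
    private final int seed;
    private final String[] keys;
    private final String[] values;

    private TinyMphStore(int seed, String[] keys, String[] values) {
        this.seed = seed;
        this.keys = keys;
        this.values = values;
    }

    /** Mixes the key's hash with a seed and maps it to a slot in [0, n). */
    static int slot(String key, int seed, int n) {
        int h = key.hashCode() ^ seed;
        h *= 0x9E3779B1;   // multiplicative mixing
        h ^= h >>> 16;     // fold high bits into low bits
        return Math.floorMod(h, n);
    }

    /** Tries seeds until every key lands in its own slot; throws if none is found. */
    static TinyMphStore build(Map<String, String> entries) {
        int n = entries.size();
        if (n == 0) {
            throw new IllegalArgumentException("entries must not be empty");
        }
        for (int seed = 0; seed < 1_000_000; seed++) {
            String[] keys = new String[n];
            String[] values = new String[n];
            boolean collision = false;
            for (Map.Entry<String, String> e : entries.entrySet()) {
                int s = slot(e.getKey(), seed, n);
                if (keys[s] != null) {
                    collision = true;
                    break;
                }
                keys[s] = e.getKey();
                values[s] = e.getValue();
            }
            if (!collision) {
                return new TinyMphStore(seed, keys, values);
            }
        }
        throw new IllegalStateException("no collision-free seed found");
    }

    /** One hash computation and one array read; returns null for unknown keys. */
    String get(String key) {
        int s = slot(key, seed, keys.length);
        return key.equals(keys[s]) ? values[s] : null;
    }
}
```

Because the table has exactly one slot per key, there is no empty space and no collision chain to walk, which is where the space efficiency and fast reads come from. This sketch keeps the keys only so it can reject lookups for keys that were never stored.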


Featured Project: Proctor

Proctor is an A/B testing framework written in Java that enables data-driven product design.
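
As a rough illustration of what an A/B testing framework has to do, the sketch below assigns a user to a bucket deterministically, so the same user always sees the same variant. It does not use Proctor's API: `BucketChooser`, `position`, and `choose` are hypothetical names, and hashing the test name together with a user identifier is an assumption made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.CRC32;

/** Illustrative deterministic bucket assignment (not Proctor's implementation). */
final class BucketChooser {

    /** Maps (testName, userId) to a stable point in [0, 1). */
    static double position(String testName, String userId) {
        CRC32 crc = new CRC32();
        crc.update((testName + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return crc.getValue() / 4294967296.0; // CRC32 values lie in [0, 2^32)
    }

    /**
     * Walks the buckets in iteration order and returns the first one whose
     * cumulative share exceeds the user's position. Pass an ordered map
     * (for example a LinkedHashMap) whose shares sum to 1.0.
     */
    static String choose(String testName, String userId, Map<String, Double> allocations) {
        double p = position(testName, userId);
        double upper = 0.0;
        String last = null;
        for (Map.Entry<String, Double> e : allocations.entrySet()) {
            upper += e.getValue();
            last = e.getKey();
            if (p < upper) {
                return last;
            }
        }
        if (last == null) {
            throw new IllegalArgumentException("no buckets defined");
        }
        return last; // fallback when shares sum to slightly less than 1.0
    }
}
```

Including the test name in the hash input means a user's position in one test is not tied to their position in another, so assignments across tests are not systematically correlated.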

   
Proctor: Indeed’s A/B testing framework written in Java
proctor-webapp: Java web application that uses proctor-webapp-library to manipulate and view Proctor definitions
proctor-webapp-library: Library used for running a web application to create and modify Proctor definitions
proctor-pipet: Java web application that provides a simple REST API to Proctor
django-proctor: Python package for your Django web app to use Proctor groups
proctor-sample-matrix: Proctor test definition files and a Maven pom.xml that demonstrate how to use the Maven plugin’s generator to build them into a single JSON test matrix file
proctor-demo: Reference implementation and demonstration of Proctor

Learn More

Proctor documentation


Featured Project: Imhotep

Imhotep is a highly scalable analytics platform that lets you do the following:

  • Perform fast, interactive, ad hoc queries and aggregate results for large datasets
  • Combine results from multiple time-series datasets
  • Build your own data tools for analysis, monitoring, reporting, and automated data processing on top of the Imhotep platform

The Imhotep suite includes the following components:

   
imhotep: Core Imhotep code
iql: Web interface for making IQL queries against an Imhotep cluster
iupload: TSV uploader webapp for an Imhotep cluster
imhotep-tsv-converter: Tool to convert TSV files into Flamdex indexes for Imhotep
imhotep-cloudformation: CloudFormation scripts and other helper scripts for spinning up an Imhotep cluster in AWS

Learn More

Imhotep documentation


Other Projects

Java Utilities

   
util-core: General Java utilities and helper classes
util-varexport: Utility that enables you to expose runtime variables from a running Java application
util-urlparsing: Utility to efficiently parse key-value pairs from query strings in URLs; also includes fast number parsing and URL decoding utilities
util-compress: Utility for compressing and decompressing data; includes Snappy and gzip
util-io: Utility for performing IO, including many interfaces implemented elsewhere
util-mmap: Utility for performing mmap operations and direct memory access
util-serialization: Utility for serializing and deserializing data to and from binary and string formats
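
For readers unfamiliar with the task util-urlparsing addresses, the following sketch parses key-value pairs from a query string using only the JDK. It is a plain baseline, not the library's API or its optimized approach: `QueryStringSketch` and `parse` are hypothetical names, and this version allocates a new string for every key and value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Baseline query-string parser built on the JDK (illustrative only). */
final class QueryStringSketch {

    /** Splits "a=1&b=2" into decoded pairs; a key without '=' maps to "". */
    static Map<String, String> parse(String query) {
        Map<String, String> out = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return out;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String rawKey = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            out.put(decode(rawKey), decode(rawValue)); // later duplicates overwrite earlier ones
        }
        return out;
    }

    /** Decodes percent-escapes and '+' as space, using UTF-8. */
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For example, `QueryStringSketch.parse("q=java+developer&l=Austin%2C+TX")` yields `q` mapped to `java developer` and `l` mapped to `Austin, TX`.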

Status

Status helps an application report the current state of the external systems it depends on, as well as the current health of its own internal components.
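
The general pattern can be sketched as a set of named checks whose results roll up into one overall level. This is an assumption-laden illustration, not the Status project's API: `HealthReport`, `Level`, `register`, `evaluate`, and `overall` are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative dependency health roll-up (not the Status library's API). */
final class HealthReport {

    /** Ordered from healthiest to least healthy. */
    enum Level { OK, DEGRADED, OUTAGE }

    private final Map<String, Supplier<Level>> checks = new LinkedHashMap<>();

    /** Registers a named check, such as a database ping or a queue-depth probe. */
    HealthReport register(String name, Supplier<Level> check) {
        checks.put(name, check);
        return this;
    }

    /** Runs every check; a check that throws is reported as an outage. */
    Map<String, Level> evaluate() {
        Map<String, Level> results = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Level>> e : checks.entrySet()) {
            Level level;
            try {
                level = e.getValue().get();
            } catch (RuntimeException ex) {
                level = Level.OUTAGE;
            }
            results.put(e.getKey(), level);
        }
        return results;
    }

    /** The overall level is the worst level among all results. */
    static Level overall(Map<String, Level> results) {
        Level worst = Level.OK;
        for (Level level : results.values()) {
            if (level.compareTo(worst) > 0) {
                worst = level;
            }
        }
        return worst;
    }
}
```

Treating a thrown exception as an outage keeps one failing check from hiding the results of the others.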

LSM Tree

LSM Tree is a fast key/value store that is efficient for high-volume random-access reads and writes.

   
lsmtree-core: Implementation of a log-structured merge tree
record-log: Library for writing data to append-only record logs optimized for replication
recordcache: Provides abstractions around writing record logs, building LSM trees, and performing LSM tree lookups
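
The core idea of a log-structured merge tree can be shown with a small in-memory sketch. It is not the lsmtree-core API: `TinyLsm` and its methods are hypothetical, segments live in memory rather than on disk, and deletes are left out. Writes go to a sorted in-memory table, full tables are frozen as segments, and reads check the newest data first.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.TreeMap;

/** Illustrative in-memory LSM tree: memtable plus frozen sorted segments. */
final class TinyLsm {
    private final int memtableLimit;
    private TreeMap<String, String> memtable = new TreeMap<>();
    // Head of the deque is the newest segment.
    private final Deque<TreeMap<String, String>> segments = new ArrayDeque<>();

    TinyLsm(int memtableLimit) {
        this.memtableLimit = memtableLimit;
    }

    /** Writes always go to the memtable; a full memtable is frozen as a segment. */
    void put(String key, String value) {
        memtable.put(key, value);
        if (memtable.size() >= memtableLimit) {
            segments.addFirst(memtable);
            memtable = new TreeMap<>();
        }
    }

    /** Reads check the memtable, then segments from newest to oldest. */
    String get(String key) {
        String value = memtable.get(key);
        if (value != null) {
            return value;
        }
        for (TreeMap<String, String> segment : segments) {
            value = segment.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    /** Merges all segments into one, letting newer values overwrite older ones. */
    void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        Iterator<TreeMap<String, String>> oldestFirst = segments.descendingIterator();
        while (oldestFirst.hasNext()) {
            merged.putAll(oldestFirst.next());
        }
        segments.clear();
        if (!merged.isEmpty()) {
            segments.addFirst(merged);
        }
    }

    int segmentCount() {
        return segments.size();
    }
}
```

Compaction keeps read cost bounded: without it, a lookup for an old or missing key would have to visit every segment ever written.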