Casually contributing to FOSS

5 min read · Feb 18, 2021

Hey there, Medium wanderer, and welcome to my TED talk: “Casually contributing to FOSS”! 😛

Amidst all the sprint-driven office work, on-call issues and never-ending backlog, doing something extra like contributing to FOSS (and maybe even blogging about it) feels like a breath of fresh air, especially because it breaks the monotonous routine and gets your creative brain working on its own.

I’m going to share my experience of a recent opportunity to casually contribute to open-source software.

Contributing to FOSS?

Okay okay, I’ll explain.

For the uninitiated, the textbook definition of FOSS (aka Free & Open-Source Software) is:

The umbrella term for software that is simultaneously considered both Free software and open-source software.

And that, kids, is why we don’t refer to textbooks anymore

Just kidding. Practically, it refers to software that anyone can collaborate on to build awesome tools! Of course, each open-source project/repo has its own rules on redistribution, modification, etc. Make it a rule of thumb to check out the LICENSE of any repository to understand more about it.

Casually?

Now we know what FOSS is. But why contribute casually?

Well, typically, open-source contributions are regarded as complex extra work. Complex, because you need to:

ensure code quality standards
ensure that none of your team/company-specific code gets committed
ensure that your change/solution is compatible with the repo, by asking the maintainers
ensure that the license reads alright
… and 101 other things you’ll learn over time as you keep contributing to OSS

But all of this is only meant to make you familiar with the OSS realm, and to help you understand how communities have built projects that have come to be used by such a huge number of developers!

It may be very easy and tempting to use these as reasons to postpone contributing your changes, or to wait for someone else to give you dedicated time (aka free up your bandwidth) and ask you to contribute. Don’t.

Waiting for dedicated time to make OSS contributions be like

Be self-driven. Just go ahead. And contribute. Casually.

The window to my casual contribution

I currently work as an SE-2 with the Unified Ingestion Platform (UIP) at Intuit India. Our team is building self-serve data lake ingestion capabilities for the entire company to leverage!

In particular, this means we need to support ingestion from as many types of databases (sources/upstream) as possible, and expose the data to our analysts (consumers/downstream).

One of our well-supported ingestion types is DynamoDB sources. We recently got an AppendOnly use case for a DynamoDB source, wherein only Insert events needed to be propagated, not Updates/Deletes.

And to top it off, around 20 TB of deletes was planned to flow in within a couple of days, due to TTL as well as a manual clean-up activity by the DynamoDB source team. Without this change, an inflow of 20 TB of unnecessary events could clog the ingestion or create a lag. The clock had started to tick.

Shout-out to all the AoT (Attack on Titan) fans, who understand this reference!

As I got assigned to this task and started getting context, I realized that we were using an open-source Lambda repository, awslabs/lambda-streams-to-firehose, and that this was exactly the component that would need to change to support the AppendOnly use case. Eyy, there was the window to my casual OSS contribution!

Repo: github.com/awslabs/lambda-streams-to-firehose

The closure to my casual contribution

Not gonna lie, I assumed that the repo would already have a feature like that built within it — and maybe I just needed to set a lambda environment variable to leverage it.

Nope. On jumping into the repo and reading the very-appropriately named README, I found out that all events just get written — there was no provision out-of-the-box to filter out specific event types!

I started digging into the code, to see where exactly the events were getting written. Maybe I could just check if their event type was Insert, before calling the write function:

Coding without an OSS mindset: you come up with a short-term non-generic solution!
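To make that concrete, here is a minimal sketch of what such a hardcoded check might look like. It is not the repo's actual code: the real project is a Node.js Lambda, this is Python purely for illustration, and `write_to_firehose` and `handle_records` are hypothetical names. The only real-world detail assumed is the DynamoDB Streams record format, where each record carries an `eventName` of `INSERT`, `MODIFY` or `REMOVE`.

```python
def write_to_firehose(record, sink):
    """Hypothetical stand-in for the real write call: collects records in a list."""
    sink.append(record)


def handle_records(records, sink):
    """Forward only INSERT events; MODIFY and REMOVE are silently dropped."""
    for record in records:
        # Hardcoded check: works for my use case, useless for anyone else's.
        if record.get("eventName") == "INSERT":
            write_to_firehose(record, sink)


written = []
handle_records(
    [{"eventName": "INSERT"}, {"eventName": "MODIFY"}, {"eventName": "REMOVE"}],
    written,
)
print(written)  # [{'eventName': 'INSERT'}]
```

It does the job, but anyone who wants, say, Inserts and Updates would have to fork the code and change it.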

And then it hit me. I should build this feature the way I had expected the repo to support it out of the box in the first place: not a hardcoded event-type check, but a configurable environment variable through which users can specify which event types get written.

In hindsight, these were the code changes that made it possible:

Adding a configurable-but-optional lambda environment variable

Coding with an OSS mindset: you come up with a generic solution, that solves the problem for more people!
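The generic version can be sketched along these lines. Again, this is an illustration in Python under stated assumptions, not the code that went into the Node.js repo: the variable name `ALLOWED_EVENT_TYPES` and both function names are hypothetical, and the comma-separated format is just one reasonable choice. The key idea is that the variable is optional, so leaving it unset keeps the old behaviour of writing every event.

```python
import os

# Event names used by DynamoDB Streams records.
ALL_EVENT_TYPES = frozenset({"INSERT", "MODIFY", "REMOVE"})


def allowed_event_types(environ=None):
    """Read the (hypothetical) ALLOWED_EVENT_TYPES variable.

    Unset or empty means "write everything", so existing users see no change.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("ALLOWED_EVENT_TYPES", "")
    requested = {part.strip().upper() for part in raw.split(",") if part.strip()}
    if not requested:
        return ALL_EVENT_TYPES
    unknown = requested - ALL_EVENT_TYPES
    if unknown:
        # Fail loudly on typos instead of silently dropping all events.
        raise ValueError(f"Unknown event types: {sorted(unknown)}")
    return frozenset(requested)


def filter_records(records, allowed):
    """Keep only the records whose eventName is in the allowed set."""
    return [record for record in records if record.get("eventName") in allowed]
```

With this shape, my AppendOnly use case is just a configuration value (`ALLOWED_EVENT_TYPES=INSERT` in this sketch), and someone else's "Inserts and Updates only" use case needs no code change at all.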

We first made the change internally and rolled it out in e2e. Once the functionality was working and showed no impact on the lambda performance, I opened a GitHub issue in the original repo to check if the maintainer was open to this feature addition, or if they had a different approach in mind.

After getting the maintainer’s green signal, I opened this PR — making the code changes, as well as documenting the environment variable in the README. And 4 days later, voila — it got merged! 🎉

PS: Yes, I did have to check the license first. The awslabs/lambda-streams-to-firehose repo is under the well-known Apache License 2.0, which is pretty permissive!

And as a final closure to this casual contribution, I also made sure to blog about it, so that more people get acquainted with the fun of making casual OSS contributions and spare some time for it.

FIN.