Categories
Community News and Resources

The Art of Community: Why Developers Contribute to Vendor-Owned Open Source Projects

Open source software (OSS) development is deeply ingrained in the developer culture, representing a distinct and inclusive collaborative ecosystem. In this chapter, we will explore the motivations behind vendor-owned OSS contributions through the lenses of experience, global region, and the use of Stack Overflow. 

OSS projects represent the power of community: collaborative efforts to develop code and software that positively impact a wider audience than the individuals involved. Vendor-owned OSS projects, e.g. TensorFlow and Visual Studio Code, combine this sense of community with financial backing from the world’s largest tech companies – a powerful combination of stability and open cooperation.

For every developer involved in vendor-owned OSS, there is a different motivating factor – why do developers contribute to these projects? The big picture is that the top three motivators for vendor-owned OSS contributors are: wanting to learn how to code better (38%), to improve the software that they use (29%), and to contribute to something bigger than themselves (22%).

How does experience affect vendor-owned OSS contribution?

When compared to beginners, those with six years of experience or more are around 13 percentage points more likely to contribute in order to improve software they use. These experienced and improvement-focussed developers are also much more likely to hold specialist roles compared to their less experienced peers. For example, they are six times more likely to be software architects and five times more likely to be either tech/engineering team leads or site reliability engineers. They not only believe that the software they use can be improved, but also that they have the capability and skills to improve it.

Experienced developers devote significant attention to enhancing the open-source software (OSS) provided by vendors, which they actively use and rely on

In fact, improving software seems to be the main motivation for many senior developers – those with 16 years of experience or more are the least likely to contribute for the majority of the other reasons we list. Learning to code better, getting noticed by their company, and getting their code reviewed are much lower priorities among seasoned developers. This is to be expected given the amount of expertise and recognition they have typically accumulated by that stage of their career.

At the other end of the scale, those most willing to contribute for their own education are developers with 1-2 years of experience. Compared to those with even less experience, these developers are 58% more likely to be exclusively professionals and 48% less likely to be exclusively students. In other words, at this stage of their careers, they have enough professional know-how and confidence to contribute to vendor-owned OSS – yet are pursuing further education for their coding skills by giving back to the community.

Vendor-owned OSS contribution around the world

According to our data, 73% of developers contribute to vendor-owned OSS globally, but the level of contribution varies around the world. Developers in South Asia are the most likely to contribute (85%), while those in Eastern Europe are the least likely (67%). As for the two largest regional developer communities, North America and Western Europe, 78% and 70% of developers contribute to corporate OSS projects, respectively.

South Asia and the Middle East and Africa are hotspots for developers contributing to vendor-owned OSS projects in order to level up their coding skills

As for specific motivations, there are a couple of hotspot regions that stand out from the crowd. Nearly half (47%) of OSS contributors in the Middle East and Africa and South Asia are motivated by learning to code better and, similarly, about one in four by the opportunity to have their code reviewed by more experienced colleagues: 10 and 5 percentage points above the global average, respectively.

Tying in with our previous analysis: these regions also hold the two largest shares of developers with less than two years of experience – 52% for the Middle East and Africa and 73% for South Asia.

However, to see how motivations towards vendor-owned OSS change across the globe, we take a wider perspective. In doing so, we group motivations into three broad categories: individual-focussed (getting noticed by the company, learning to code better, etc.), collaboration-focussed (getting their code reviewed by knowledgeable people, etc.), and business-focussed (building community support around a corporate open source software project). In this manner, we can get a view of how sentiments towards vendor-owned OSS change around the world.

For instance, we see that developers in Oceania are at least 5 percentage points more likely than any other region to have business-focussed motivations when contributing to vendor-owned OSS projects. This may be linked to the financial success/focus of developers in this region – 9% of OSS contributors in Oceania report that they or their organisation generate more than $1M of revenue every month on average, compared to the global average of 4%.

Female developers are considerably more likely to be business-focussed when contributing to vendor-owned OSS

An interesting note on gender: we see that globally, female developers are 26% more likely than male developers to be business-focussed in their approach to vendor-owned OSS contribution. This observation is particularly strong in Europe: 54% of female developers in Western and Eastern Europe are business-focussed, compared to 33% of male developers. However, as the proportion of OSS-contributing female developers (22%) is only slightly higher than the global proportion (21%), it’s unlikely that they drive business-focussed regional behaviour.

How do OSS contributors use Stack Overflow?

Let’s look at the usage of a website that is synonymous with cooperation in programming and software development and see how the proportion of OSS contributors changes with varying levels of interaction. For users of Stack Overflow, we see a behavioural trend – those who are more active on the website are more likely to contribute to vendor-owned OSS.

Diving into the specific usage patterns of Stack Overflow, those who don’t use or visit the site are the least likely to contribute to vendor-owned OSS for any reason, compared to those who use the site at any level. This is again related to experience: 39% of those who don’t use Stack Overflow have less than a year of software development experience and only 5% have an account with a badge; these developers are the least likely to contribute to vendor-owned OSS projects, after those with more than 16 years of experience.

Likewise, there are differences in motivations to contribute to vendor-owned OSS between those with or without Stack Overflow badges. For example, only 28% of OSS-contributing developers without a badge want to improve the software they use, in contrast to 40% of developers with badges. A possible driver here is professional status – 74% of those without a badge are professionals. For those with a badge, 91% are professionals: these developers are not only more focussed on improvement, they are more willing to engage with the community to do so.

The strength of community shines through in vendor-owned OSS projects, where collaborative efforts to develop software have the remarkable ability to create positive impacts on a broader audience beyond the individuals directly involved. Here, we’ve shown that developers involved in vendor-owned OSS have different motivations depending on their experience, gender, and region, which in turn reflects how they use collaborative environments like Stack Overflow. 

Categories
Community Interviews

Interview: What is it like working on open-source game development?

I had a chance to speak with Liam Arbuckle, the acting CTO of the game/web development studio/collective (100% open-source) called Signal Kinetics. Liam is based in Australia. 

What is it that you’re working on?

Right now, we’re working on a citizen science game engine (sort of like Project Discovery in Eve Online, but integrating other games as well). We’re aiming to increase science discovery/contribution for everyone through gaming by allowing people/players to: 

1. Contribute to real-world scientific problems/experiments  

2. Help train ML/DL datasets/algorithms (sometimes through their actions in-game) 

3. Engage with users, especially those in the scientific community (we’re working on a service called Arcadia which is basically a fork of Buddypress that will implement features similar to services like Steam & Facebook Games) 

So you are targeting citizen scientists? Is there a particular age range you are targeting?

I believe information should be free. When I was younger, scientific journal access was expensive, and there is a lack of engagement with the science community in Australia. I want to create something that can’t restrict a person from the science community due to their age, gender, spending ability, etc.

What inspired you to create your Game Engine?

I attended science hackathons, got into science and gaming, and made a Mars rover. Most recently, I contributed to the Open Source Rover by NASA’s Jet Propulsion Laboratory.

What else are you working on?

We’re also working on our own game (with potential partnerships with Savy Soda, as an example, in the pipeline) and hoping to make our experience with Arcadia modular:

 1. Users can contribute to scientific research through playing any game in the world by installing a custom add-on designed by the Arcadia developers for the game.

 2. Users can have a “bank” (similar to the Pokemon Home system) that shows their games library, achievements and item list/screenshots in the Arcadia web app.

There’s no real large gaming community to play games online. I want to build a community, where gamers can share screenshots, there’s an overlay to watch people playing games, I want to make mini-games too, there’s no real limit. I used to game on a Samsung phone which had a play status, I could stream to Discord. I want to expand this idea to non-Samsung games, add a community with no limits – basically information freedom, with no blockage or limits.   

3. Users can choose which games to play.

What are your immediate goals?

Get industry connections.

What type of connections are you looking for?

I’ve made contact with Melbourne-based game companies so I’m on track with that. I’m looking for grants, an investment to work on the blockchain element, get connected with a marketing team, get a few 1,000 players to start off with, and then connect with more on social media.

Right now I don’t have the money to finance, we’ve had people come online to help with the open-source. I’d ideally like to get some consistent engagement rather than have contributors that do the occasional work.

If it wasn’t for Covid-19 I would have moved out of Australia; there are huge problems with setting up in Australia: no grants and no infrastructure for tech companies.

I want to start contributing to established games and engines to gain experience, connections, contribute and potentially expand my team’s vision.  

Are there any particular games that you have in mind?

Minecraft. I would love to contribute to that; I love that you can make mods. I think Minecraft is crying out for more integrations, so I would love to get connections with Mojang. I’ll take any company that has a level of open-source ethos.

Continue working on the game, however, this requires money. A lot. And I’m not rich! I’m primarily focusing on a media kit that will later be used as the basis for a Kickstarter campaign.   

When do you plan to run the Kickstarter campaign?

I won’t have the game finished before the campaign starts. I want to put together a media kit and assets, and I’m going to an incubator to learn how to market the game and understand which social media and which niche users we should target. I have been involved with other Kickstarter projects and know I can’t be too broad with who I target at first. I think in 3-4 months we will be ready to launch the Kickstarter campaign.

I’ve got a team of about 20-30 people (most are external/outside collaborators; around 10 people run the show and contribute on a consistent basis). These people have varying levels of experience in game development, design, and web app construction (among other things).

Are you actively looking for more contributors? If so, what level of experience are you looking for?

I’ll take anything, I won’t say no to anyone, I find that the science community say no, if we say no, we’re just defeating the purpose of the project.

We would prioritise people who have C# and website building experience. Once you get your base established, then start with junior developers. We don’t want to be too closed, but also we don’t want to be too open and not get work completed. 

We are also working on a partnership with the Swedish Power Metal band Veonity to contribute with us on officially licensed songs for our games and the Arcadia platform  – recording is due to start in July which is very exciting!


Did you know that 34% of game developers use C#?

Interview with Liam Arbuckle

When did your interest in development start?

I love Star Wars, at 12 I went into robotics, and in 2016-2017 I worked to build a physical R2D2. In year 10 I started a computer science class at school. Unfortunately, computer science investment in schools is poor, but I had a good teacher that encouraged younger students who were not yet at the age to attend a class to learn in their breaks. I learned Python, and in year 11 I started working on GitHub, learned Ruby on Rails, Gem. 

I ended year 11 and decided I wanted to start developing. There are no astrophysics courses near to me. You can build games and tell stories from computer science.

How do you make decisions when it comes to your next self-improvement step? Do you look at data, attend conferences?

I attended the recent Atlassian conference. Also, there are 20 of us that meet at a bar regularly to talk about problems, I have joined a few teams and am developing professional skills. 

I pitched to investors last year and got 10,000 AUD but it doesn’t last very long in a startup.

I like to see people in the physical world, go to Python global conferences, learning what’s the newest feature with the project that I can use to my advantage.

Has it been a benefit to have online conferences due to Covid-19?

I would never have been able to afford travel to conferences until this year when I’ve started making money, the online conferences are more accessible.

Before, if you were not fully embedded in a developer community, there was not much incentive to go to in-person conferences: there is a huge cost to fly overseas for a conference, no guarantee that the project of interest will be discussed, and no guarantee that people will help you there. There are more frequent conferences now, by more teams, not just big companies doing them.

Do you have a mentor? Or are you mentoring someone else?

I’m a mentor at the University Codjo, mentoring 14-15-year-olds with Autism / ADSD. For me, the computer sciences teacher was a mentor at school, but I don’t have anyone mentoring me right now. I wouldn’t need a mentor right now for teaching me, rather someone who can structure how I do things, I’m not the best, I’m not perfect, people with experience have given great advice to me.

Do you have any words of wisdom for others thinking of building their own games or game engines?

1. I echo the words of “information wants to be free” if everyone open sources and has no barriers, that would be my ideal world!

2. If you want to make any media, games are great because they engage people. I lose interest in reading novels, but in games there is so much you can involve other people with, and everyone can make their own stories. There’s engagement.

What’s in your toolbox?

  • Unity for most of my games stuff
  • Starship, customisable prompt for my terminal – makes everything look so much cooler. I love customising my devices.
  • GitHub
  • Keybase for communications, encryption and there are git integrations.
  • Notion 
  • Visual Studio Code
  • Jira by Atlassian – more of an industry-standard than what I was using before.
  • MacBook M1 for on-the-go stuff, I dual boot with Linux when testing.

How do you work as a distributed team? What tools do you use?

Keybase is the main tool, git commits can be seen in there and there are cool bots and tools you can use. It was also acquired by Zoom which shows that things will be great for global teams.

We also use Facebook messenger or WhatsApp for casual talk.


What do you need right now?

Right now direct partnership with companies is needed, funding is so important. Everyone in the team is paying out of their own pockets. The best way we can succeed is with funding so the Kickstarter will work, with partnerships, it will give our Kickstarter legitimacy. 

If you’re interested in joining forces with Liam and his team either as a developer committed to open-source, or a partner, you can reach Liam via his GitHub profile.

We love to hear your development stories, get in touch to share yours.

Categories
Analysis

What do developers value in open source?

Open-source software (OSS) is used by 92% of developers, so what exactly do they value in it? We find that developers value OSS’s ability to supersede any single contributor and live on almost eternally. We highlight some uncertainty around OSS’s future by showing trends from geographic regions and sectors. The findings shared in this post are based on the Developer Economics survey 19th edition which ran during June-August 2020 and reached more than 17,000 developers in 159 countries.

What exactly do developers value in open-source?

Open-source software (OSS) is ubiquitous in the global developer community. As our data shows, OSS is used by 92% of developers. A question that comes to mind is: what exactly do developers value in OSS? In the chart below, we show which statements developers value about OSS, broken down by professional and nonprofessional developers, and enterprise and non-enterprise developers. The overarching theme for what developers value from OSS is its ability to be eternal. “To collaborate with the community, building software that outlasts even its originator” encapsulates the two statements with the greatest agreement.

The overall cost and wanting to avoid vendor lock-in/lock-out are important aspects that professional and enterprise developers in particular value in OSS, while non-enterprise developers value forking product derivatives and debugging more than the other groups. Non-professional developers do not value the overall costs element, perhaps because they have not experienced the costs involved in closed source software, whereas many professional developers have. Another aspect that non-professional developers value significantly less is avoiding vendor lock-in. This also suggests that these developers have not experienced the limitations of closed source software yet.

Appreciation of the overall costs of OSS is also highly linked with years of developer experience: only 24% of developers with less than one year of experience agree that low cost is an asset of OSS. In contrast, the percentage of developers who agree that low cost is an asset of OSS rises to 34% of developers who have between three and five years, and 43% of developers with six or more years of experience. Typically, as developers gain experience, they begin to work in different sectors, often crossing over between sectors. At this point, the flexibility that OSS offers may become crucial. 

Finally, we also see a greater proportion of non-professional developers not using OSS compared to others. This is also reflected indirectly in each of the other statements; we see that non-professional developers agree with every statement less than professional developers. This suggests that, to be truly appreciative of the benefits of OSS, you may have had to engage with it seriously, in the way professional developers do.

Where OSS is written is changing

At present, the culture of OSS is particularly strong with Western European and Israeli developers, where not a single statement is valued below the average. On the contrary, developers in North America—who, up until now, have driven the OSS movement—value contributing and interacting with the community less than average. This could suggest a cooling off of North American OSS development and a maturing of this ecosystem. 

On average, East Asian developers seem to be disengaged from the OSS movement more than developers from other regions. Only 88% of developers in this region use OSS compared to 92% globally. In general, developers in this region also value fewer aspects of OSS. In particular, their extremely low appreciation of the continuous support for the technology compared to others highlights that developers in this region are apprehensive about the longevity of OSS, which partially undermines its main benefit. This apprehension is also reflected by the relatively low agreement associated with contributing. 

According to our data, South Asian developers value contributing to OSS significantly more than others. In addition, South Asia is the region with the largest proportion of developers who value collaborating and interacting with the community. This combination positions the region to be among the drivers of the next wave of OSS development. In the Middle East and Africa region, some key advantages of OSS, such as avoiding vendor lock-in and the overall low cost have not yet resonated with developers — this is despite the fact that, at least for Africa, income per capita is low compared to global averages. What assists in explaining this is this region’s proportion of professional developers and the experience of its developers. 

The Middle East and Africa, as well as South America, have roughly the same proportion of professional developers, 60.7%, in contrast to North America or Western Europe and Israel, where more than 80% of developers are professional. Non-professionals value OSS less. Similarly, developers in the Middle East and Africa are also the least experienced, on average, and years of experience in particular is linked with appreciating the low cost of OSS.

Some sectors embrace OSS while others don’t

Emergent sectors such as augmented reality (AR) and virtual reality (VR) stand to benefit greatly from OSS as a means of defining a common standard and exchanging ideas. Yet, we find that developers working in these two fields do not value forking/creating product derivatives, nor even collaboration in the case of VR, as much as developers in other fields do, on average. This could be partially explained by their lower than average agreement with the need for continuous support for a technology. When developers do not value this characteristic, it is unlikely that they are working with the mindset which would ensure long term OSS growth and desirability. 

On the other hand, developers who are building apps and extensions for third party ecosystems, on average, value contributing and forking more than developers in other sectors. Similarly, the very successful Node.js runtime has facilitated other extensions and developers working in backend services really value the continuous support of OSS projects. At present, despite the large percentage of developers who use open source software, it is only in certain circumstances that the majority of developers value OSS for any given reason. Perhaps this suggests that OSS has become an expectation rather than being perceived as a gift from society at large to society at large. Observing how developers value OSS in the future would be a good litmus test for the health of open source projects. For now though, there are encouraging blooms in South Asia for example, but also software sectors of scepticism, such as in AR/VR.

Are you involved in open source? Share your experiences with us in our Developer Economics 20th edition survey!



Categories
Tips

Top Companies Contributing to Open Source

Who are the top companies contributing to open source? This blog post looks at how CodersRank used publicly shared data to answer this question, and how they created a series of data visualization videos.

The boom of open source software brought a change in technology that shaped the world as we know it today.

Open source exists thanks to the hard work of dedicated programmers and developers. It has become the foundation of cloud computing, software-as-service, next generation databases, mobile devices, the consumer internet, and more. 

Image: open source illustration. Source: undraw.co

We, at CodersRank, are great admirers of open source. Almost every one of us contributes to open source regularly, and we sometimes work on a project or two together. 

In this blog post, we’ll introduce you to a video that we created. This video gives you a visual representation of companies that contributed the most to open source since 2012. If you find this data interesting, then you’d probably love to know the methodology behind it. We’ll show you exactly how we gathered the data and then how we gave it a visual spin. 

Video: Top Companies Contributing to Open Source

The video, “Top Companies Contributing to Open Source | 2012-2019”, is part of a series of data visualization videos that we came up with at the end of 2019. We made these out of curiosity, after realizing that you could see certain trends forming if you put together some of the publicly shared data. 

Haven’t seen the video? Here it is:

Behind-the-scenes of our method

This will be a quick overview of our method – please see the actual code used further down this page.

Measuring contributions

When measuring contributions, we only considered commits. We know there are many other ways to contribute to a project, but in this particular case we wanted to focus on commits.

Defining contributing authors

We relied on the email addresses of the commit authors. The part of the address after the @ is usually the company’s domain.
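As a rough illustration of this step, the snippet below splits an address at the first `@`, mirroring the `@(.*)` pattern used in the queries further down. The helper name `email_domain` is hypothetical and exists only for this example; treating an empty domain as missing is our assumption, not something the original pipeline states.

```python
from typing import Optional


def email_domain(address: str) -> Optional[str]:
    """Return everything after the first '@', or None if there is no domain."""
    _, sep, domain = address.strip().partition("@")
    if not sep or not domain:
        # No '@' at all, or nothing after it (assumption: treat as missing).
        return None
    return domain


print(email_domain("a454492e42fd9810e577ebee548c7e59bd883bca@live.com.au"))
print(email_domain("no-at-sign"))
```

A domain found this way only hints at an employer: personal addresses and bot accounts still have to be filtered out afterwards.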

Assigning commits to a company

There are around 2.4B public commits on GitHub (since 2011) and we had to analyze each and every one of them to answer this question. Thankfully, not manually!

Using the GitHub API to extract that amount of data would have been impossible. Thanks to the GitHub Archive Project, all the public GitHub events are stored in a publicly available BigQuery database. Using SQL to extract data made the process easy and painless.

Cleaning up the data

After we counted the commits for each company, the data needed to be cleaned. First, we excluded email providers like gmail, hotmail, yandex etc. Then, we excluded a few more, as there were some cases where the commits were made by bots.

Converting the results to the expected format

We used Flourish to create the videos. The data had to be converted into a format that Flourish accepts (e.g. handling months with no data from a given company).

Implementation

Step 1: get the commits/domains

Screenshot: a PushEvent row in BigQuery (Step 1: get the commits/domains)

The payload column is what we needed here, since it contained the email address. In our example it is a454492e42fd9810e577ebee548c7e59bd883bca@live.com.au. GitHub hashed the first part of the email, but we didn’t need that anyway, because we were only curious about the domain-level information. 

The query to count the commits/domain name looked like this:

## pre-2015 API
CREATE TEMP FUNCTION
 json2array(json STRING)
 RETURNS ARRAY<STRING>
 LANGUAGE js AS """
         return JSON.parse(json).map(x=>JSON.stringify(x));
       """;
WITH
 export_domains AS(
 SELECT
   DATE_TRUNC(DATE(created_at), month) AS month,
   emails,
   ARRAY(
   SELECT
     REGEXP_EXTRACT(x, "@(.*)")
   FROM
     UNNEST(emails) x
   WHERE
     REGEXP_EXTRACT(x, "@(.*)") IS NOT NULL) AS domains
 FROM (
   SELECT
     * EXCEPT(array_commits),
     ARRAY(
     SELECT
       JSON_EXTRACT_SCALAR(x,
         '$[1]')
     FROM
       UNNEST(array_commits) x) emails
   FROM (
     SELECT
       created_at,
       json2array(JSON_EXTRACT(payload,
           '$.shas')) array_commits
     FROM
       `githubarchive.day.20130101`
     WHERE
       type='PushEvent' )))
SELECT
 month,
 flattened_domains AS email_domain,
 COUNT(flattened_domains) AS domain_count
FROM (
 SELECT
   month,
   flattened_domains
 FROM
   export_domains
 CROSS JOIN
   UNNEST(export_domains.domains) AS flattened_domains )
GROUP BY
 month,
 email_domain
ORDER BY
 month,
 domain_count DESC

After 2015, the format of the payload changed a bit and required a slightly different query:

## post-2015 API
CREATE TEMP FUNCTION
 json2array(json STRING)
 RETURNS ARRAY<STRING>
 LANGUAGE js AS """
         return JSON.parse(json).map(x=>JSON.stringify(x));
       """;
WITH
 export_domains AS(
 SELECT
   DATE_TRUNC(DATE(created_at), month) AS month,
   emails,
   ARRAY(
   SELECT
     REGEXP_EXTRACT(x, "@(.*)")
   FROM
     UNNEST(emails) x
   WHERE
     REGEXP_EXTRACT(x, "@(.*)") IS NOT NULL) AS domains
 FROM (
   SELECT
     * EXCEPT(array_commits),
     ARRAY(
     SELECT
       JSON_EXTRACT_SCALAR(x,
         '$.author.email')
     FROM
       UNNEST(array_commits) x) emails
   FROM (
     SELECT
       created_at,
       json2array(JSON_EXTRACT(payload,
           '$.commits')) array_commits
     FROM
       `githubarchive.day.20150102`
     WHERE
       type='PushEvent' )))
SELECT
 month,
 flattened_domains AS email_domain,
 COUNT(flattened_domains) AS domain_count
FROM (
 SELECT
   month,
   flattened_domains
 FROM
   export_domains
 CROSS JOIN
   UNNEST(export_domains.domains) AS flattened_domains )
GROUP BY
 month,
 email_domain
ORDER BY
 month,
 domain_count DESC
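To make the difference between the two payload layouts easier to see, here is a small Python sketch of the same extraction logic. It assumes only what the queries above rely on: older payloads carry a `shas` list whose entries hold the email at index 1, and newer payloads carry a `commits` list with an `author.email` field. The function names and the sample payloads are made up for illustration and are not real GitHub Archive rows.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def commit_emails(payload_json: str) -> Iterator[str]:
    """Yield author emails from a PushEvent payload in either layout."""
    payload = json.loads(payload_json)
    # Newer layout: payload["commits"] is a list of objects.
    for commit in payload.get("commits") or []:
        email = (commit.get("author") or {}).get("email")
        if email:
            yield email
    # Older layout: payload["shas"] is a list of arrays, email at index 1.
    for sha_entry in payload.get("shas") or []:
        if len(sha_entry) > 1 and sha_entry[1]:
            yield sha_entry[1]


def count_domains(payloads: Iterable[str]) -> Counter:
    """Count commits per email domain across many payloads."""
    counts: Counter = Counter()
    for payload_json in payloads:
        for email in commit_emails(payload_json):
            _, sep, domain = email.partition("@")
            if sep:
                counts[domain] += 1
    return counts


# Hand-made sample payloads (hypothetical values).
old_style = json.dumps(
    {"shas": [["abc123", "dev@example.org", "fix typo", "Dev", True]]}
)
new_style = json.dumps({"commits": [
    {"sha": "def456", "author": {"email": "dev@example.org", "name": "Dev"}},
    {"sha": "789abc", "author": {"email": "ops@example.net", "name": "Ops"}},
]})

domain_counts = count_domains([old_style, new_style])
print(domain_counts.most_common())
```

At the scale of billions of commits this loop would be far too slow, which is why the counting itself is better left to BigQuery; the sketch is only meant to show what each query unpacks.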

The result looked like this:


Row  month       email_domain              domain_count
1    2015-01-01  gmail.com                 131357
2    2015-01-01  users.noreply.github.com  8802
3    2015-01-01  python.org                5786
4    2015-01-01  hotmail.com               4942
5    2015-01-01  fhda.edu                  3888
6    2015-01-01  yahoo.com                 3216
7    2015-01-01  etudes.org                2736
8    2015-01-01  qq.com                    1955
9    2015-01-01  sly.mn                    1908
10   2015-01-01  foothill.edu              1848

Step 2: exclude email providers

The heavy lifting was done by BigQuery. We exported the results into a .csv file and used the good old Jupyter Notebooks to clean up the data.

As you can see in the example result, not surprisingly, the first one was gmail.com. Our next task was to remove the email providers from the list. 

We used a GitHub gist listing the most popular email domains for the cleanup: https://gist.github.com/tbrianjones/5992856/

And we also added some other blacklisted domains (excluded_domains.txt):

.(none)
91177308-0d34-0410-b5e6-96231b3b80d8
samo-laptop.(none)
dd0e9695-b195-4be7-bd10-2dea1a65a6b6
ubuntu.(none)
b8fc166d-592f-0410-95f2-cb63ce0dd405
b9a71923-0436-4b27-9f14-aed3839534dd
b2dd03c8-39d4-4d8f-98ff-823fe69b080e
0b4bb1d4-4e5a-0410-9cc4-b2b747904278
709f56b5-9817-0410-a4d7-c38de5d9e867
iki.fi
Gmail.com
none
example.com
1a063a9b-81f0-0310-95a4-ce76da25c4cd
localhost.localdomain
localhost
localhost.(none)
home
b8457f37-d9ea-0310-8a92-e5e31aec5664
li7-202.members.linode.com
g
users.noreply.github.com
us.door43.org
mailinator.com
smullindesign.com
review.openstack.org
nyarlabo.com
boston.com
li.gugod.org
niob.xnis.de
sly.mn
kazer.org
recoil.org
tsaousis.gr
rituwall.com
cbrese.com
renovateapp.com
scrapers.everypolitician.org

Step-by-step code walkthrough

Load BigQuery results

from tqdm.notebook import tqdm
 
import pandas as pd
import numpy as np
 
tqdm.pandas()
 
df = pd.read_csv("./email_domains_large.csv")

Merge the list of domains we want to exclude:

free_providers = list()
with open("./free_email_provider_domains.txt", "r") as f:
    for line in f.readlines():
        free_providers.append(line.strip())
excluded_emails = list()
with open("./excluded_domains.txt", "r") as fe:
    for line in fe.readlines():
        excluded_emails.append(line.strip())
free_providers = free_providers + excluded_emails

Add a new column to the dataset that flags whether the domain belongs to a free email provider:

df["free"] = df["email_domain"].progress_apply(lambda x: x in free_providers)
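The exclusion list above includes Gmail.com as its own entry, which suggests the lookup is case-sensitive. One possible variation, not what the notebook does, is to normalise the domains and use a set with `Series.isin`, which also avoids scanning a long list for every row. The sample domains below are illustrative only.

```python
import pandas as pd

# Illustrative sample; the real data comes from email_domains_large.csv.
df = pd.DataFrame({"email_domain": ["gmail.com", "Gmail.com", "python.org", "localhost"]})

# Assumed subset of the merged exclusion list, normalised to lower case.
excluded = {d.strip().lower() for d in ["gmail.com", "localhost", "example.com"]}

# Normalise the column the same way, then flag rows with a vectorised set lookup.
df["free"] = df["email_domain"].str.strip().str.lower().isin(excluded)

df_filtered = df[~df["free"]].copy()
print(df_filtered["email_domain"].tolist())
```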

Create a list without the email providers

df_filtered = df[(~df["free"])].copy()

Add a row counter and limit the data to those domains that appear at least once among the top 30. This will make the final dataframe smaller and easier to handle.

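# Rank rows within each month (0-based). This assumes the rows are already
# sorted by domain_count within each month, as in the query output.
df_filtered["rn"] = df_filtered.groupby("month").cumcount()
# Keep every domain that reaches a monthly top 30 at least once.
domains = np.unique(df_filtered[df_filtered["rn"] < 30]["email_domain"])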

Step 3: format data

As mentioned before, we used Flourish to create the video. In some cases there were empty months (the company didn’t have any commits), and Flourish expected the columns to be months, not companies. So we had to fill the gaps and reshape the data too.

Step-by-step code walkthrough

df_final = df_filtered[df_filtered["email_domain"].isin(domains)].copy()
df_final["month"] = pd.to_datetime(df_final["month"])
date_range = pd.date_range(df_final["month"].min(), df_final["month"].max(), freq="MS")

temp_df_list = list()
for _, repo_data in df_final.groupby("email_domain"):
    # One row per month in the full range, even if the domain had no commits.
    df_temp = pd.DataFrame({"month": date_range})
    df_temp = df_temp.merge(repo_data, on="month", how="left")
    # Carry the last known values forward into empty months.
    df_temp = df_temp.ffill()
    # Months before the first commit still need the domain name.
    df_temp["email_domain"] = df_temp["email_domain"].bfill()
    # Anything still missing (the leading months) counts as zero.
    df_temp = df_temp.fillna(0)
    temp_df_list.append(df_temp)

# Note: "domain_count_rolling" is not created in the snippets shown here;
# it is presumably computed in the full notebook.
df_full_data = pd.concat(temp_df_list, ignore_index=True).sort_values(["month", "domain_count_rolling"], ascending=[True, False])

df_chart_race_final = pd.DataFrame()
df_chart_race_final["email_domain"] = list(df_full_data["email_domain"].unique())
 
for current_month, monthly_data in df_full_data.groupby("month"):
    month_name = current_month.strftime("%Y-%m") 
    
    df_temp = df_full_data[["month", "email_domain", "domain_count_rolling"]].query("month == @current_month")
    df_chart_race_final = df_chart_race_final.merge(
                df_temp.drop("month", axis=1).rename(index=int, 
                                             columns={"domain_count_rolling": month_name}), 
                on="email_domain", how="left")
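The loop above turns long data (one row per month and domain) into the wide layout Flourish expects (one row per domain, one column per month). As a hedged alternative sketch, `DataFrame.pivot` in pandas can do the same reshaping in one call. The sample values are invented; only the column names are taken from the code above.

```python
import pandas as pd

# Invented long-format sample: one row per (month, email_domain).
df_full_data = pd.DataFrame(
    {
        "month": pd.to_datetime(["2015-01-01", "2015-01-01", "2015-02-01", "2015-02-01"]),
        "email_domain": ["python.org", "fhda.edu", "python.org", "fhda.edu"],
        "domain_count_rolling": [10, 7, 12, 9],
    }
)

# Pivot to wide format: domains as rows, months as columns.
df_wide = df_full_data.pivot(index="email_domain", columns="month", values="domain_count_rolling")

# Use "YYYY-MM" column labels, matching the month_name format used in the loop.
df_wide.columns = [m.strftime("%Y-%m") for m in df_wide.columns]
df_wide = df_wide.reset_index()
print(df_wide)
```

Unlike the merge loop, `pivot` raises an error if a (month, domain) pair appears twice, which can be a useful sanity check.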

The full notebook can be found here.

Final word

More and more companies are recognizing the importance of open source software development and are committed to supporting it.

We hope that visualizing just a slice of the data that these amazing men and women generate round the clock is a way to acknowledge their hard work. Thanks to them and the millions of hours they invested to build open source products, we get to use our everyday apps and software seamlessly. 

Thank a developer today!

About

Adrienn Tordai
Growth Marketer / Pizza Enthusiast @CodersRank. I love the Blue Jays, books, and The Office. Tell Elon I said hi. Always waiting for a Steam sale. | CodersRank: Our goal is supporting DEVELOPERS’ growth by their always up to date, professional CodersRank profile

Categories
Community Tips

Infographic: Who is behind open-source software?

In our 18th survey wave, we asked developers whether they contribute to open-source software, and if so, why. In this post, we’ll explore who the contributors to open-source software are, their reasons for contributing, and finally what open-source support they expect from companies.

Open-source contributors tend to be younger than non-contributors.

About a third (33%) of developers who contribute to open-source software are under 24 years old, compared to 26% of non-contributors. This is not to say that they are inexperienced programmers; 41% of open-source contributors have 1 to 5 years of experience, 4 percentage points more than non-contributors.

Contrary to what one might think, open-source contributors are not necessarily professionals. In fact, they are as likely to be amateurs as non-contributors are. You don’t have to be working professionally in the software industry to be involved and contribute to open-source software development.

Open-source contributors are more likely to be involved in multiple development areas than non-contributors. In particular, they are significantly more likely to be involved in emerging sectors such as machine learning/AI and AR/VR, where innovations are mostly driven by open-source tools.

Finally, as you’d expect, developers’ likelihood of contributing to open-source software is also reflected in their activity on the most popular open-source hosting site, GitHub. The correlation is clear. Two-thirds of developers who don’t contribute (67%) have no personal public repositories on GitHub, whereas close to half of the contributors (48%) have two or more public repositories. We observe a somewhat similar relationship with Stack Overflow. Non-contributors are significantly more likely to not use the Q&A site at all, or to visit the site without having an account. On the other hand, open-source contributors are twice as likely as developers who don’t contribute to have earned at least one badge (30% vs 15%). Working on open-source projects seems to encourage developers to actively engage with their peers on Q&A sites.

We’ve seen which developers contribute to open-source software projects. Let’s now dive into the reasons for contributing.

Why contribute to open-source software

Developers are most motivated to contribute to open-source projects to improve coding skills (29%) and a belief in the benefits of open-source (26%). What’s more, 22% of developers contribute to open-source software because it’s fun or to solve an issue with an existing open-source software project such as fixing a bug or creating a new feature.

By contrast, financial compensation is the least important motivation. Only 3% of developers are getting paid for their work on open-source projects. As it turns out, developers are more likely to get involved in open-source projects to build their reputation (14%) or to network (11%) rather than for direct financial gain. Furthermore, developers who get paid to contribute are almost 20 percentage points less likely to think it’s fun than those who contribute for other reasons. They are also significantly less likely to believe in open-source as a source of freedom, as an ideological imperative. 

Typically developers don’t contribute to open-source for a single reason but are motivated by multiple factors. For example, half of the developers who contribute to open-source for improving their coding skills also think it’s fun. 56% of contributors who want to network also feel like it makes them belong somewhere.

What developers expect from companies

In our Q4 2019 Developer Economics survey, we also asked developers what open-source support they expect from companies. Thirty-three percent of developers not contributing to open-source don’t expect anything from companies, as compared to 15% among open-source contributors. That said, two-thirds of non-contributors still think that companies should be involved and provide support to the open-source software movement; they realise how important open-source is and believe that companies should be a part of it.

On the other hand, 44% of open-source contributors expect companies to support and contribute to open-source communities. This increases to 55% for developers who contribute to solve an issue. Many contributors (44%) expect full documentation on how to use open-source software on companies’ products or services. This is especially important to developers who get paid for their work (53%).

Interestingly, open-source developers do not necessarily expect companies to build products and services upon open-source software (39%). This is the least important vendor expectation from developers in terms of support for open-source software.

Open-source software contributors are a diverse group of people. Their motivations to contribute range from learning, having fun, solving issues to building relationships and reputations. In summary, developers have plenty of reasons to contribute to open-source, and they expect companies to support them along the way. 

If you are involved in open-source and want to share your views, visit our latest survey and help shape the trends.

Categories
Tools

What have developers been reading

While novel readers were busy paging through murder mysteries and historical fiction this past spring, developer interests were data and analytics, Jakarta, cloud-native articles, Kubernetes and open source.

That’s what we discovered when we took a look at quarter-over-quarter pageview trends in DZone.com. A little background: 29 million unique readers visit each year. The research is based on article tags assigned by DZone editors and used to help readers search once they’re on our site. They aren’t keywords.

Consuming All Things Data in the Data Category

Our first pass researching tags occurred in the first quarter, when interest in data analytics and tools articles skyrocketed. Same situation for Q2 vs Q1. Our findings did show that readership shifted away from the data scientist tagline toward specific tools and data strategies that anyone can implement.

Of the fastest-growing data analysis topics, the data analysis tools tag grew over 30X, and related topics like ingesting data (the collection of data into/out of the database for immediate use) and augmented analytics (machine learning-powered data analysis) grew about 10X more popular with readers.

Terms like ingesting data and augmented analytics speak to the need for more than just a dashboard approach to consuming data. Tim Spann, a Big Data thought leader and field engineer at Cloudera, thinks a consolidation in analytics is underway.

“I think there’s going to be consolidation. And a lot of startups are going to try to integrate a couple of these things together. They’re going to try to add more features and differentiate themselves. You’re going to see more of the data analytics tools try to do ingest and vice versa, (so) they’ll be a more interactive platform,” he explains.

“You’re getting more data, you need to be able to ingest, you need to be able to analyze it, you need to be able to build apps out of them — it’s not just enough to have a static report, or even a dashboard that people look at, people actually have uses for this data.”

From Java EE to Jakarta

After Oracle announced in 2017 that they were handing over Java EE to the Eclipse Foundation, a few changes began to take place. One was that Enterprise Java would now be called Jakarta EE.

In the second quarter of this year, the Eclipse Foundation announced that all Jakarta specifications with “javax.*” must be changed to “jakarta.*” This had the potential to significantly impact, and potentially harm, existing Enterprise Java applications.

It’s no surprise that developers were on DZone searching for the best ways to comply with these changes. We saw a lot of growth in all related topic tags, including Enterprise Java, which grew 10 times more popular, and Jakarta, which grew over 35 times more popular in Q2.

Apps and Cloud-Native Development

Cloud-native development is drastically changing the way we build applications. The term cloud-native refers to a style of container-based development that creates applications from scratch, or refactors older applications, to be fully optimized for the cloud. This is very different from older application development philosophies that retroactively adjusted apps to be cloud-enabled using methods like lift-and-shift or re-platforming.

According to this article on what it means to be cloud-native, the applications contain three major traits. They are container-centric, dynamically managed (i.e., containers are managed and organized by Kubernetes or other similar platforms), and microservices-oriented.

From Q1 to Q2, many aspects of cloud development showed steady growth. However, topics specific to cloud-native development grew far faster, with the term itself showing over 300% growth last quarter.

Related tags that also showed an increase included:

  • Cloud-based microservices (500%)
  • Cloud-native deployment (140%)
  • Scaling microservices (160%)

“The cloud-native ecosystem will see explosive growth as the growing adoption of Kubernetes will translate to a growing need to make it manageable for the enterprise. There are both huge gaps in tooling and many unrealized opportunities in making fleets of microservices more manageable — and we expect to see projects sprout up to handle both the gaps and the opportunities,” said Gwen Shapira, Data Architect at Confluent, in this interview with DZone.

Is Kubernetes the King of Development?

Kubernetes is the largest cloud-native platform designed to manage, scale, and deploy containers. As cloud-native development continues to grow, so does interest in Kubernetes.

“Kubernetes is a game changer. It’s slowly taking over the way the Internet works as far as application development and deployment,” explains Bob Reselman, an industry analyst and technical educator.

“It’s all changing so fast. Every five years, the stack is changing. Because of this, developers are finding themselves in a constant state of adaptivity and looking for the next best tool and ways to manage, scale, and deploy applications.”

Not surprisingly, we found topic tags related to Kubernetes and Kubernetes deployment showing tremendous growth from Q1 to Q2, including:

  • k8s: another name for Kubernetes (7X more popular).
  • Kubeadm: a fast-path command for creating a Kubernetes cluster (151%).
  • Kubelet: the agent that runs on each Kubernetes node (118%).
  • Kubernetes services: rules and abstractions for Kubernetes pods (85%).
  • GKE: Google Kubernetes Engine (66% growth).

Open Source Topics Remain Popular Developer Interests

Nearly all of the above-mentioned topic tags contain one common theme: Open source.

The open-source topics that saw the most growth include:

  • Open APIs (11X).
  • Open source big data tools (200%).
  • Open source communities (110%).

Additionally, we saw growth in a wide range of open-source tools and platforms — some mentioned above, like Kubernetes, Apache, Jakarta, RSocket, and many more.

As the stack continues to change and evolve, developers will seek out open-source software first. Without question. Kubernetes and data tools like Apache Spark and Kafka are all open source, and they all dominate the ecosystem and rank high in developer interests.

“I believe enterprises will increasingly turn to managed platforms delivering 100% open-source technologies in 2019, as they increasingly seek to avoid the vendor and technological lock-in that remain too common with proprietary open source offerings,” explained Ben Slater, CPO at Instaclustr, in an interview with DZone late last year.

“Given the fact that commercialized open-source technologies can leave enterprises at the mercy of price increases (and make it impossible to run solutions on their own or implement useful modifications), fully open-source technologies offer a compelling alternative.”

“Open-source solutions are empowered by engaged communities that help ensure rapid improvements and bug resolution, better security, full transparency and reliability, and a faster time to market at a lower cost.”

About the author:

Lindsay is a Content Coordinator at Devada. She works closely with contributors to DZone, a website for software developers and IT professionals to learn and share their knowledge. Editing and reviewing submissions to the site, she specializes in content related to Java, IoT, and software security.

Categories
Tools

Choosing a Javascript charting library in 2016

Given the overabundance of tools available to a Javascript developer in 2016, finding and choosing the right one is often a challenge.

Especially when it comes to visualising data, either drawing animated charts or implementing custom interactive infographics, the choice becomes harder since there are a lot of tools out there: Wikipedia’s “Comparison of Javascript charting frameworks” currently lists 44 different libraries, jsgraphs.com currently stands at 72(!) different charting tools and – to make matters worse – the Google search result for “best javascript charting libraries” over-delivers with approximately 786K results, out of which the first 20 results are all links with titles like “{{integer in multiples of 5}} best javascript libraries”.

Naturally, most of the relevant Stack Overflow questions of the “what is the best tool” nature are “closed as not constructive”.

In this article I aim to help with the above challenge by means of a slightly unconventional approach: In my research I tried to quantify the merits of the most popular libraries, given a series of “developer-friendly” metrics.

Sounds weird and subjective? It is. Read on.

Step 1: Understanding declarative vs imperative approaches

Before we start comparing, it is essential to understand how almost all of the available libraries can be split into two distinct categories based on their approach. Let us borrow from classical computer science and use the “declarative vs imperative” paradigm comparison for this.

The declarative approach

The majority of JS charting libraries follow the declarative paradigm: You write code that describes what you want to end up with, and the library makes it happen, dealing with all the minutiae.

FusionCharts, Highcharts, amCharts, Chart.js etc. all follow this approach: You pick a chart type (column, bar, pie), you specify a configuration object and the library outputs a nice looking interactive chart based on your wishes.

The imperative approach

On the other hand, tools like D3.js, Paper.js or Snap.svg follow the imperative paradigm. They provide you with helper methods which you then need to use to write code that visualises your data step-by-step.

For example, to create a bar chart with D3.js you will need to initialise the canvas, calculate where to draw the axis, draw the axis, calculate where to draw the columns, draw the columns, the legend, the point data, add the events etc.
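To make the distinction concrete, here is a toy sketch in Python rather than JavaScript. Nothing in it is the API of any library named above: `draw_bar_chart_imperative` and `render_chart` are hypothetical helpers that emit a bare SVG string. The first spells out each drawing step by hand; the second accepts a configuration object and hides those steps.

```python
def draw_bar_chart_imperative(values, width=200, height=100, gap=4):
    """Imperative style: compute and emit every shape ourselves."""
    bar_width = (width - gap * (len(values) - 1)) / len(values)
    tallest = max(values)
    shapes = []
    for i, value in enumerate(values):
        bar_height = value / tallest * height  # scale to the drawing area
        x = i * (bar_width + gap)
        y = height - bar_height  # SVG's y axis points down
        shapes.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{bar_width:.1f}" height="{bar_height:.1f}"/>'
        )
    return f'<svg width="{width}" height="{height}">' + "".join(shapes) + "</svg>"


def render_chart(config):
    """Declarative style: the caller describes the chart, we handle the details."""
    renderers = {"bar": draw_bar_chart_imperative}
    renderer = renderers[config["type"]]
    return renderer(config["data"], **config.get("options", {}))


# Declarative usage: a configuration object and nothing else.
svg = render_chart({"type": "bar", "data": [3, 7, 5], "options": {"width": 300}})
print(svg)
```

The trade-off is visible even at this scale: the declarative call is one line, but anything the configuration object does not expose (say, a bar with rounded corners) requires dropping down to the imperative function.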

It does feel a little counter-intuitive to choose any tool that follows the imperative approach, until one sees the amazing work Mike Bostock (creator of D3.js) implemented for the New York Times. Animated interactive infographics such as the 2012 “512 Paths to the White House” show how powerful a library like D3.js can be.

In the next step I made an effort to establish “winners” in each of those two categories.

Step 2: Quantify the popularity of each tool

2.1 Mentions in Google articles

One needs to start from somewhere so my first step was to revisit the “best javascript charting libraries” Google search, filter out to show results only from the past year, open the first 20 hits (first two pages) and note down which libraries were mentioned.

95 libraries in total – see the “Mentions” tab of this Google sheet for a full reference (it’s the first tab – and yes, there exists a library called “Aristocharts”. Seriously).

I then filtered out the list to include only the libraries with at least 4 hits:

[Image: table of the libraries with at least 4 mentions]
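The tallying and filtering described above can be sketched in a few lines of Python. The article lists are placeholders, not the real search results (those are in the Google sheet); only the threshold of 4 comes from the text, and counting a library once per article is an assumption.

```python
from collections import Counter

# Placeholder data: each inner list holds the libraries one article mentioned.
articles = [
    ["D3.js", "Chart.js", "Highcharts"],
    ["D3.js", "Chart.js", "Google Charts"],
    ["D3.js", "Highcharts", "Chart.js"],
    ["D3.js", "Chart.js", "Aristocharts"],
]

# Assumption: count each library once per article, even if it is repeated there.
mentions = Counter(lib for article in articles for lib in set(article))

MIN_HITS = 4  # threshold used in the text
shortlist = sorted(lib for lib, hits in mentions.items() if hits >= MIN_HITS)
print(shortlist)
```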

Some surprises here already. No amCharts? No mention at all of Paper.js or Snap.svg? Interestingly, the fact that N3-charts “made the cut” here can be construed I believe as a testament to the popularity of Angular.js.

What comes as no surprise is the popularity of D3.js. It is also the only library in this list that follows the “imperative” paradigm making D3.js the clear choice when it comes to that approach. I marked D3.js as the “winner” in the imperative category, filtered it out and continued.

2.2 Licensing

Next step was to establish the license of each library. Many developers are partial to tools that are either open source or come with really relaxed licensing.

[Image: table of library mentions and licenses]

Some reading is involved if one wants to make sense of Google’s custom license for their “Charts” library (it’s free with a lot of caveats). Interestingly, ZingChart offers a free (albeit watermarked) version of its library.

Please note that personal bias prevented me from filtering out the commercial offerings at this stage. I’ve used both Highcharts and FusionCharts in the past to great success and as a result I opted to not judge based on price – until I had all the metrics that is.

2.3 GitHub stars and watchers

Does the library have a repo on GitHub? And if yes, how many people are watching, how many have starred it? I intentionally steered clear of other GitHub measurements such as number of contributions or PRs since each repo has owners and each owner has his / her own personal approach towards how “open” they are to contributions.

On the other hand, a project’s star rating is a clear indication of how many developers (rather than simply users) “like” it. The “Watch” metric also tells us the number of devs who actively want to be notified when new things happen in the project.

[Image: table of GitHub stars and watchers per library]

Google Charts had a few GitHub repos but they were for projects that wrap / package their “Google Charts” project, e.g. GoogleWebComponents.

Two things stood out for me in the table above. The massive community support for Chart.js – almost the same number of stars as Backbone.js (!) – as well as the number of people who are watching Chart.js.

What is also surprising is that out of the three commercial offerings, Highcharts’ number of stars is orders of magnitude higher than the rest.

2.4 Stack Overflow tagged questions

Another metric that can tell us how many people are using a library is the number of questions that are “tagged” on Stack Overflow for a specific library.

This is not foolproof – one might argue that extremely well designed and structured libraries will be so intuitive in their use that people will not be asking questions about them – but my personal experience has shown that even the simplest of tools generate a lot of questions when used by a lot of people.

[Image: table of Stack Overflow tagged questions per library]

Is Highcharts the most difficult library of all? Or is it perhaps the most widely-used one? Perhaps its developers are extremely responsive to the questions of the community? We cannot answer this with 100% confidence. What these counts show, however, is that Highcharts has more tagged questions on Stack Overflow than all the rest of the libraries combined.

Choosing the winners

Since I’ve already opted for a (completely) subjective approach, what better way to pick a winner than by simply... adding everything up?

Here are the top 5 libraries sorted by “Score”, i.e. the sum of GitHub stars, GitHub watchers and Stack Overflow tagged questions:

[Image: table of the top 5 libraries sorted by score]

Full data available on the “Final results” tab of this Google sheet.

Declarative approach – Open source – Chart.js

If the GitHub stars are anything to judge by, there is a lot of developer enthusiasm for what Chart.js offers.

The documentation is clear and concise – http://www.chartjs.org/docs/ – with several inline examples, browser support is solid (as long as <canvas> is supported Chart.js will work – this means no IE8 and some inconsistencies on <= IE 10) and the 8 chart types it offers should be more than enough for most needs.

Declarative approach – Commercial – Highcharts

I’m an avid user of Highcharts and I was pleasantly surprised to see it “rising” in the ranks of my little quantitative experiment. The massive number of Stack Overflow questions clearly signifies that despite its commercial nature, the community uses it… a lot. The high number of GitHub stars (I repeat: for a commercial project) is also quite indicative of the “developer feelings” for Highcharts.

The documentation is stellar (with a really powerful “Demo” showcase where every single example is linked to a working JSFiddle), the API browser / reference is a great resource and browser support is not an issue since Highcharts auto-falls back to VML rendering for older IE browsers.

Check out this JSFiddle to see how easy it is to visualise a table like the one shown in “Choosing the winners” above:

[Image: chart from the JSFiddle]

Imperative approach – Open source – D3.js

The central principle of D3 is to enable developers to programmatically construct SVG objects and render them as they see fit. As long as you can visualise it, D3.js can help you (a) draw it, (b) make it interactive and (c) animate it.

D3.js is the tool to use when a charting library simply won’t cut it. And the community demonstrates this very clearly:

GitHub stars – 54848
GitHub watchers – 2653
Stack Overflow tagged questions – 22036

If I had to score this the same way I scored the charting libraries, then D3 leaves everyone behind by a factor of 2 with a score of 79537.
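The scoring rule is purely additive, so the D3.js figure is easy to check. The sketch below uses the three numbers quoted above; the dictionary layout and the function name are illustrative.

```python
# Metrics quoted above for D3.js.
d3_metrics = {
    "github_stars": 54848,
    "github_watchers": 2653,
    "stackoverflow_tagged_questions": 22036,
}


def popularity_score(metrics: dict) -> int:
    """Sum of GitHub stars, GitHub watchers and Stack Overflow tagged questions."""
    return sum(metrics.values())


print(popularity_score(d3_metrics))  # 79537
```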

Categories
News and Resources

Angular team announces final release of version 2.0

Welcome to DeveloperEconomics’ weekly news roundup. In this edition, Google announces the release of Android Studio 2.2, Oracle confirms rumours of a Java EE 8 delay and Microsoft has been crowned the new king when it comes to open source contributors. Read on for the full news rundown.

Google app ads beat Facebook with 3 billion installs

Google says its ad products are now responsible for more than three billion app installs. The announcement follows Facebook’s claim in April that its ads have generated over two billion installs. Google says it’s also experiencing a decline in average ad prices, down 9% year-on-year, due to the continuing growth of YouTube ads.

Microsoft has most open source contributors, says GitHub

Microsoft has beaten Facebook to become the organisation with the most open source contributors on GitHub. Microsoft racked up 16,419 contributors, beating Facebook’s 15,682 and Docker’s 14,059. GitHub’s report also found that JavaScript is the most popular language, Font Awesome is the repository with the most open source contributors and Homebrew is the repository with the most users reviewing code.

Java EE 8 not ready until end of 2017

Oracle says the release of Java EE 8 will be delayed until the end of next year. The delay, which was rumoured for some time, was announced at the JavaOne conference last week, where a new roadmap was proposed. Oracle now plans to release Java EE 8 with basic microservice and cloud capabilities, before releasing EE 9 sometime in 2018 with more features.

Affectiva emotional analytics platform now free for indie devs

Start-up Affectiva is allowing any company that earns less than a million dollars a year to use its SDK and API. The Affectiva platform uses “emotional analytics” to analyse user sentiment via chatbots or surveys. The company also announced a partnership with Giphy, which will see Affectiva encode Giphy gifs for sentiment analysis.

Angular team announces final release of version 2.0

The Angular team has announced the final release version of Angular 2.0. The new version of the JavaScript framework features better support for modern browsers, modular functionality that makes it easier to use third-party libraries, and is recommended for use with Microsoft’s TypeScript. Google also says it will provide devs with more guides to learn Angular 2.0 faster.

Android Studio 2.2 released

Android Studio 2.2 is now available to download. The update brings a significant number of new features, including an improved layout editor, an activity recorder that generates Espresso code for automated testing, and an emulator that can simulate data from different sensors. The new IDE also boasts an APK analyser, GPU debugger and much more.

GitHub announces project management tools and support for formal reviews

GitHub has announced the “biggest update yet” to its platform, bringing project management features to the table. The built in Trello-like project management tool lets users move cards with pull requests and switch cards between columns such as “in progress” and “done.” GitHub also now lets devs formally approve all pull requests and leave review summaries.

Kochava releases free version of app analytics tool

Kochava has launched Free App Analytics, a tool to measure and optimise app ad campaigns. The free tool lets devs optimise campaigns across big networks such as Facebook, Google, Amazon, Twitter and Snapchat. The tool also includes a global index of integrated ad networks. However, features such as scaling are only available in Kochava’s paid Enterprise offering.

Microsoft opens Desktop Bridge for Win32 app conversion

Microsoft’s Desktop Bridge is now ready to use, allowing devs to repackage desktop apps, including Win32 apps, for the Windows Store. The Desktop Bridge also converts apps to the Universal Windows Platform, allowing Win32 apps to run on any device running Windows 10. Microsoft says the bridge has already been used by the likes of Evernote, Arduino IDE and doubleTwist to bring full-featured apps to the Windows Store.

Oracle announces ‘drag and drop’ chatbot platform

Oracle has unveiled a new platform for building and running chatbots. The tool doesn’t require any coding experience – featuring a drag and drop graphical interface – and is positioned as an easy-to-use bot builder for enterprises. According to Oracle, its bots will work with all modern messaging platforms, such as Facebook, Slack and Kik.

Google acquires API.AI bot building start-up

Google has bought API.AI, a start-up that provides dev tools for building conversational bots. According to Google, over 60,000 developers are using API.AI’s tools to build conversational experiences for environments such as Slack, Facebook Messenger and Kik. The terms of the acquisition have not been disclosed.

Categories
Platforms

Using Bash in Windows – today

“… However, when we talked with web developers, they still struggled with using Windows as their primary devbox.”

The above quote is from Kevin Gallo, the VP of Windows Dev platform, and was delivered around the 0:38 mark of his presentation in Microsoft’s Build 2016 keynote. He then continued with the observation that “… many of them have workflows which rely on open source command line tools, scripts and frameworks”, and finished with a slide that his audience was – at first – slightly unsure how excited to get about: Bash is coming to Windows.

Screenshot #1: Kevin Gallo’s slide from Build 2016 announcing Bash coming to Windows

If you let the video play for another 7 seconds, you’ll also catch a glimpse of Gallo’s audience. You can see the emotions depicted on their faces form a picture that explains perfectly the complex (and sometimes tumultuous) relationship of Microsoft with Linux and the Open Source world. Three persons are smiling excitedly and beginning to slow clap (the ones that suddenly realise how much easier managing their OS stack or scripting their Windows environment will become). You then have the classic cautious indifference of the majority of developers that wait to see whether this is “worth getting excited about”. Finally, you can also detect some unguarded annoyance from the fanboy crowd (“Seriously? I have to sit and hear about Bash? What’s wrong with PowerShell?”).

Personally, I belong to the first group. Despite working with open source technologies since the beginning of my professional career back in 2003, I’ve never managed to move away from Windows. So when I saw Rich Turner and Russ Alexander casually doing an apt-get install git on Windows to install git, I was excited. A lot.

But until the functionality showcased in the video above is mature and stable enough to be rolled out, I’ll continue using my current workflow, which has served me faithfully since 2011: Bash on Windows (to be precise, a more “cut down” version of Bash; read on for details).

The challenge: Production-strength command line workflow in Windows.

One might argue that Windows was never meant to be “driven” from the command line.

Microsoft tried to mitigate this back in 2006 by rolling out PowerShell, a shell and scripting language that gives users full access to their whole Windows environment. For Windows devs this was a great extra tool but for all other developers it was still not enough to lure them away from the power and versatility they found on the Linux command line.

Add to this the strongly opinionated naming conventions and approaches that PowerShell inherited from the .NET Framework (did you know that cd is but an alias to the “proper” command, which is Set-Location? That’s Pascal case _and_ a dash, which autocompletes with tab even if you type it in lowercase. Strange stuff) and you can see why it’s really hard for, say, a PHP developer to consider it for their dev workflow.

When every single blogpost or article or tutorial written about a subject, e.g. “how to rebase branches in git”, includes instructions and screenshots that clearly demonstrate the flow in a Linux shell, it’s only natural for the developer to assume that this is the correct way of doing things.

Towards a solution: Install Git for Windows

For my frontend-with-a-bit-of-PHP-but-from-a-Windows-OS workflow I always relied on certain “battle proven” tools. WinSCP was the weapon of choice when files needed to be moved from one place to another (either via FTP, SFTP, SCP or even rsync). PuTTY allowed me to connect via SSH to all my dev boxes. TortoiseGit ensured that I could use git directly from my Windows Explorer interface.

The first “lightbulb / aha” moment for me occurred when I installed Git for Windows after being prompted to “try it out on the command line” by a colleague.
One of the steps of the install wizard prompts you to choose “How would you like to use Git from the command line?”:

Screenshot #2: Choosing how to use Git for Windows

… and it mentioned “Bash”!

Installation completes and suddenly I get a shell in Windows that looks suspiciously similar to what I’m used to in Linux or OS X installations:

Screenshot #3: MinTTY terminal emulator window

Bash in Windows: How it works

Kudos to the awesome devs that worked to bring Git to Windows – https://git-for-windows.github.io/.
In essence the installer sets up a Unix-like shell environment (MinGW – “Minimalist GNU for Windows”) which – very roughly speaking – creates the needed Unix layer that shells like Bash can run on.
A terminal emulator called MinTTY is also installed (shown in screenshot #3 above) which is a Windows program that runs the Bash shell which in turn enables you to use quite a good subset of the Linux commands needed for an open source dev workflow.
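
One quick way to see this layer from inside the shell is to ask uname what it thinks it is running on. The sketch below assumes that Git for Windows reports a kernel name starting with MINGW or MSYS (the exact string varies with version and architecture); the function name detect_platform and the labels it prints are made up for illustration.

```shell
#!/usr/bin/env bash
# Classify the environment from the kernel name that `uname -s` reports.
# An optional first argument overrides the detected name (handy for testing).
detect_platform() {
  local kernel="${1:-$(uname -s)}"
  case "$kernel" in
    MINGW*|MSYS*) echo "windows-bash" ;;  # assumed prefixes for Git for Windows shells
    CYGWIN*)      echo "cygwin" ;;
    Linux*)       echo "linux" ;;
    Darwin*)      echo "macos" ;;
    *)            echo "unknown" ;;
  esac
}

# Print the label for the shell this script is running in.
detect_platform
```

A check like this can be handy at the top of a shared script that has to behave slightly differently on a Windows devbox than on a Linux server.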

Looks are important

… especially if you are an ex-designer-turned-frontend-developer. Going from the black and white severity of cmd.exe (where you could not even resize the window to the dimensions you wanted) to MinTTY definitely boosted my “developer happiness” feeling:

Screenshot #4: MinTTY terminal emulator window

In the above example, I manually mapped the colours from the famous Solarized colour theme to the default 16 ANSI colours. For the font I chose the crystal clear Consolas font set at 12 point, although I’ve recently been experimenting with Adobe’s Source Code Pro as an alternative.
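
For anyone who wants to try something similar, here is a rough sketch that writes a Solarized Dark mapping into a MinTTY settings file. It is not necessarily the exact mapping used in the screenshot: the RGB values follow the published Solarized palette and its usual 16-colour terminal assignment, the setting names are the ones MinTTY is understood to read from ~/.minttyrc, and the function name write_solarized_minttyrc and its scratch-file default are purely illustrative.

```shell
#!/usr/bin/env bash
# Write a Solarized Dark colour mapping in MinTTY's "Name=R,G,B" settings format.
# The target defaults to a scratch file so an existing ~/.minttyrc is not overwritten.
write_solarized_minttyrc() {
  local target="${1:-./minttyrc.solarized}"
  cat > "$target" <<'EOF'
Font=Consolas
FontHeight=12
ForegroundColour=131,148,150
BackgroundColour=0,43,54
Black=7,54,66
Red=220,50,47
Green=133,153,0
Yellow=181,137,0
Blue=38,139,210
Magenta=211,54,130
Cyan=42,161,152
White=238,232,213
BoldBlack=0,43,54
BoldRed=203,75,22
BoldGreen=88,110,117
BoldYellow=101,123,131
BoldBlue=131,148,150
BoldMagenta=108,113,196
BoldCyan=147,161,161
BoldWhite=253,246,227
EOF
}

# Generate the file in a temporary location for inspection.
write_solarized_minttyrc "${TMPDIR:-/tmp}/minttyrc.solarized"
```

After reviewing the generated file, its lines can be copied into ~/.minttyrc; newly opened MinTTY windows should then pick them up.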

The MinTTY window can be resized to any dimension of your choosing. You can also use the same shortcuts as you use in the browser to resize the text on the fly (CTRL+plus, CTRL+minus or CTRL+mouse wheel). Finally you can launch as many instances of MinTTY as you want, enabling you to lay out a series of windows into your codebase and file structure, exactly as it suits you:

Screenshot #5: Multiple instances running at the same time at different dimensions and font-size

I can now do {{thing}} from the command line

The list below demonstrates just a small subset of the stuff you can do with Bash in Windows that I found particularly useful and / or helpful.

  • Git
    No more “download and unzip”. Git clone any repo of your choosing in any directory in your filesystem. The handy “GIT Bash here…” shortcut that appears when you right click any folder is particularly useful here.
  • Linux command line
    MinGW supports a subset of the various commands and programs available in Linux: things like awk, sed, grep and find are all here, ready to be used. Shortcuts are also available (CTRL+U and CTRL+K for inline editing, CTRL+R to search the Bash history etc.), as well as piping and redirection.
  • SSH
    OpenSSH works right out of the box. Set up your keys by using ssh-keygen (exactly the same way you would do in a Linux box) and then connect to any of your machines. You can also set up an ssh-agent (exactly the way Beanstalk or GitHub or Bitbucket explain in their online tutorials) to ensure you don’t retype your password all the time. Of course ftp and scp are available as well.
  • Vim
    No more Notepad++ for me. After I went through the steep-as-Mount-Everest learning curve I found out that vim was the best tool for quick text edits (I’ve strongly resisted the urge to play with emacs. We’ll see).
  • Bash scripting
    The very first bash script I experimented with (and use constantly nowadays) is z: https://github.com/rupa/z. I no longer rely on lengthy cd statements such as:
    cd /some_directory/nesting/nested/my_work
    But rather do a:
    z my_work
    … and I’m immediately taken to the directory I want.
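
To give a flavour of the piping and redirection mentioned in the list above, here is a small self-contained sketch that counts TODO markers per file. The directory layout, file names and TODO comments are invented for the demo, and it assumes mktemp is available alongside find, grep, sed and awk.

```shell
#!/usr/bin/env bash
# Build a throwaway project so the pipeline has something to work on.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/src/css" "$demo_dir/src/js"
printf 'body { color: red; } /* TODO: use theme colour */\n' > "$demo_dir/src/css/site.css"
printf '// TODO: remove debug output\nconsole.log("hi");\n// TODO: add tests\n' > "$demo_dir/src/js/app.js"
printf 'nothing to do here\n' > "$demo_dir/src/js/util.js"

# Count TODO markers per file:
#   find - list the regular files under src/
#   grep - print "path:count" for each file (-c counts matching lines, -H forces the path)
#   sed  - strip the temporary directory prefix from each path
#   awk  - keep only files with at least one match and format the line
# The final ">" redirects the result into a report file instead of the terminal.
find "$demo_dir/src" -type f -exec grep -cH 'TODO' {} + \
  | sed "s|^$demo_dir/||" \
  | awk -F: '$2 > 0 { printf "%s has %d TODO(s)\n", $1, $2 }' \
  > "$demo_dir/todo-report.txt"

# Show the report.
cat "$demo_dir/todo-report.txt"
```

The same one-liner style works unchanged on a Linux box, which is exactly the point: the tutorials written for a Linux shell can be followed as they are.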

“You should really switch to {{enter Linux distro name here}}”

Indeed. But even if I do so, there is still a vast number of devs out there who still need / have to work with Windows. One year ago, Isaac Schlueter (co-founder and CEO of the Node Package Manager – NPM) had this to say:

Bash in Windows: this matters
If you want devs using your code, this matters

Until WSL is out … Bash in Windows

The soon-to-be-released Windows Subsystem for Linux is a brilliant (and much-needed) step forward in making the Windows environment a first-class citizen for open source development workflows. Nevertheless, there is no need to wait for Microsoft to make WSL available to everyone.

I’ve been using Bash in Windows – in my daily workflow – for the last 5 years and it’s working like a charm.
If you want to do the same, simply install Git for Windows.

Categories
Platforms

Developers: builders or explorers?

What do you think about when you hear the words “software developer”? Most people probably imagine a duffy engineer, turning his boss’s requirements into code. A software builder, so to speak. But developers are so much more. They’re often more like adventurers and explorers, boldly going where no programmer has gone before. This was never more true than on the eve of the Internet of Things. The most important role of Internet of Things developers is to explore new possibilities. The technology is widely available, in no small part because of open source software and hardware projects. Now we need to learn where we can take it. We can build it, but should we?

Why are Explorers so important?

Explorers are critical to any developer ecosystem, including in the Internet of Things. First, because that’s where all the truly new, out-of-the-box ideas come from. It’s hard to be super-innovative when you have a project to deliver to your boss or client. Only by exploring seemingly crazy ideas can the Internet of Things reach its full potential. The open source ecosystem is often the area where these ideas bloom.

Secondly, while exploring, Explorers gain a tremendous amount of experience. This will help them build their careers (as builders or otherwise). It also helps the companies that pay the bill. And it is needed. In Q4 2014, VisionMobile surveyed 4,000+ IoT developers. The lack of hardware development skills was the top challenge among IoT developers. 48% of IoT open source enthusiasts (those who find it important to use an open source platform) listed it as a challenge.

Learning and open source

VisionMobile’s data also shows that exploring, learning and open source technology go very well together. Among Explorers (developers that are primarily interested in gaining experience to seize on future opportunities), 20% value open source platforms and technology. That’s the highest level of any group. Conversely, Explorers are the biggest group among open source enthusiasts (32%).

Furthermore, open source is popular among developers that are new to IoT and new to software. A second group who value open source are seasoned software developers who bring open source business models to the Internet of Things. Traditional IoT developers with lots of experience underuse open source. In a way, these “experienced in software, new to IoT” developers provide another kind of ecosystem-level learning.

More data

Here are some more key insights about IoT explorers and open source enthusiasts that we summarized in an infographic, co-created with Arduino:

  • Open source is not just useful for building skills. It is also used by developers that want to increase efficiency (we call them Optimizers) and by developers that work on commission (Guns for Hire). This indicates that open source tools get the job done quickly, efficiently and inexpensively. On the other hand, developers are cautious about using open source technology in commercial products.
  • Open hardware in particular helps IoT developers to address their 3 main challenges: a lack of hardware skills, immature tools and high production costs. Arduino is clearly a leader in this space.
  • “Open” seems to be a professional philosophy that is applied to hardware, software and protocols alike. 60% of open source enthusiasts feel that open standards are missing from IoT, compared to 44% of other IoT developers.
  • All this doesn’t mean that open source has won everywhere. Some verticals, e.g. wearables, seem more difficult to address with open source technology and are therefore less popular among open source enthusiasts. Sometimes open source platforms struggle in the face of strong closed-source competitors. Smart Home platform openHAB is a good example.

In conclusion, developer-explorers are critical to any developer ecosystem, and open source technology is an important tool to make that happen. I for one can’t wait to see what these modern-day Marco Polos and David Livingstones discover next!
