Software Development Trends in South Asia#
Project Overview#
The software development industries in Bangladesh, Sri Lanka, and India have grown rapidly over the past decade. This analysis provides a data-driven view of the trends, challenges, and opportunities in the software development sectors of these three countries. Using data from several sources, including the GitHub Innovation Graph, it explores key metrics related to software activity, collaboration, and industry growth.
The report is structured as follows:
Data Collection: Details on the datasets used for the analysis, including their sources and relevance to the software industry in each country.
Data Analysis: An exploration of the key trends identified from the data, highlighting differences and similarities between the three countries.
Conclusions: Insights drawn from the data analysis, with potential implications for the future of software development in the region.
Data Collection#
The data for this analysis is sourced from multiple publicly available datasets and repositories, primarily focusing on software development activity. The GitHub Innovation Graph is the primary source, providing structured data on developer activity aggregated by economy. Additional sources include government reports and industry publications from each country.
| ID | Dataset | Description | License | Access |
|---|---|---|---|---|
| 1 | GitHub Innovation Graph | Data on Git pushes, developers, organizations, repositories, and other software development metrics. | Open | |
Units of Analysis#
Git Pushes, Developers, Repositories, Collaborators: The unit of analysis is Country-Quarter, so each observation describes one country (Bangladesh, Sri Lanka, or India) in one quarter. This makes it possible to track how Git pushes, the number of active developers, the number of repositories, and international collaborations evolve quarter by quarter, and to compare the three countries over time.
Languages: The unit of analysis is Country-Programming Language-Quarter, so each observation describes one programming language in one country in one quarter. This shows which languages are most popular in each country, how their usage has changed over time, and how the adoption of languages such as JavaScript, Python, and Java compares across countries and quarters.
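The two units of analysis can be illustrated with a minimal Python sketch. It assumes each observation is a dictionary; the field names (`country`, `language`, `year`, `quarter`, `count`), the `aggregate` helper, and the sample values are hypothetical placeholders, not the schema or contents of the actual dataset.

```python
from collections import defaultdict

# Field names for the two units of analysis described above (assumed names).
COUNTRY_QUARTER = ("country", "year", "quarter")
COUNTRY_LANGUAGE_QUARTER = ("country", "language", "year", "quarter")


def aggregate(records, unit, value_field):
    """Sum value_field over all records that share the same unit key."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[field] for field in unit)
        totals[key] += record[value_field]
    return dict(totals)


# Placeholder records for illustration only; these are not real observations.
records = [
    {"country": "LK", "language": "Python", "year": 2023, "quarter": 1, "count": 3},
    {"country": "LK", "language": "JavaScript", "year": 2023, "quarter": 1, "count": 5},
    {"country": "LK", "language": "Python", "year": 2023, "quarter": 2, "count": 4},
    {"country": "IN", "language": "Python", "year": 2023, "quarter": 1, "count": 7},
]

by_language = aggregate(records, COUNTRY_LANGUAGE_QUARTER, "count")
by_country = aggregate(records, COUNTRY_QUARTER, "count")

print(by_language[("LK", "Python", 2023, 1)])
print(by_country[("LK", 2023, 1)])
```

The same records yield a finer or coarser view depending only on which fields form the key, which is the practical difference between the two units.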
Key Metrics Analyzed#
Git Pushes: The number of code contributions made by developers.
Developers: The total number of active developers per country.
Repositories: Growth in the number of software projects hosted on GitHub.
Languages: The most popular programming languages used. A set of top-ranked languages (JavaScript, Java, Python, TypeScript, C++, PHP, Ruby, C#, and Go) was selected based on various rankings and the number of available observations (Coursera Staff, 2024; DataCamp, 2023; Rice CS, 2023). Other languages, such as SQL, R, Julia, and MATLAB, were excluded because insufficient data limited their comparability.
Collaborators: Cross-country collaborations and open-source contributions.
Data Analysis#
1. Developer Activity#
Sri Lanka: Leads with over 1,500 developers per 100k inhabitants by 2024, reflecting a highly active and engaged tech community.
India: Shows steady growth, with nearly 1,000 developers per 100k by 2024, highlighting its massive, growing developer base.
Bangladesh: Trails behind but shows steady progress, growing from just over 200 developers per 100k to over 500 by 2024.
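The per-100k figures in this section are population-normalized rates. A minimal sketch of that normalization follows, assuming a raw count and a population estimate are available; the function name `per_100k` and the input values are illustrative placeholders, not figures from the analysis.

```python
def per_100k(count, population):
    """Return count expressed per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Placeholder inputs for illustration only, not figures from the analysis.
example_rate = per_100k(count=5_000, population=1_000_000)
print(example_rate)
```

Normalizing by population is what allows a small country such as Sri Lanka to be compared directly with a much larger one such as India.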
2. Repository Growth#
Sri Lanka outpaces the other two countries, exceeding 4,000 repositories per 100k inhabitants by 2024.
India follows closely, with over 2,500 repositories per 100k, while Bangladesh lags with just over 1,000 repositories per 100k by 2024.
3. Git Pushes#
Sri Lanka also leads in Git pushes per capita, surpassing 2,500 pushes per 100k by 2024.
India shows steady growth, nearing 1,000 pushes per 100k.
Bangladesh has lower activity but is growing, reaching around 600 pushes per 100k by 2024.
4. Organizations#
Sri Lanka has the highest number of organizations per capita, with over 90 per 100k by 2024.
India follows with over 50 organizations per 100k, while Bangladesh shows slower growth, nearing 30 organizations per 100k by 2024.
5. Programming Languages#
JavaScript dominates in all three countries, with particularly rapid growth in Sri Lanka.
Python and PHP show steady increases across the board, with Python gaining significant popularity in India and Sri Lanka for data science and machine learning.
Disruptive Tech Skill and Tech Skill#
| Language | Skill Group | Justification |
|---|---|---|
| HTML | Tech Skill | Markup language used for web development. |
| CSS | Tech Skill | Styling language for web pages; not associated with emerging technologies. |
| JavaScript | Tech Skill | Enables complex web and app development; part of AI and ML pipelines via frameworks like TensorFlow.js. |
| Python | Disruptive Tech Skill | Core language in AI, machine learning, and data science. |
| Shell | Tech Skill | Used for scripting and automation; not directly linked to emerging technologies. |
| PHP | Tech Skill | Server-side web development language; mature, not disruptive. |
| Java | Tech Skill | General-purpose; widely used in backend systems and Android development. |
| TypeScript | Tech Skill | Structured superset of JavaScript; widely used for scalable and modern application development. |
| SCSS | Tech Skill | CSS preprocessor; improves styling but unrelated to emerging technologies. |
| Dockerfile | Tech Skill | Used in DevOps and containerization; not directly a disruptive technology. |
| C | Tech Skill | Systems programming; used in embedded systems but not inherently disruptive. |
| Ruby | Tech Skill | Scripting/web development language; mature and not tied to disruptive trends. |
| C++ | Disruptive Tech Skill | Critical for robotics, embedded AI systems, and high-performance computing. |
| C# | Tech Skill | Enterprise software and game development; not central to emerging disruptive tech. |
| Kotlin | Tech Skill | Modern Android development language; not central to emerging tech. |
| Objective-C | Tech Skill | Legacy language for macOS/iOS; not associated with emerging technologies. |
| Swift | Tech Skill | iOS/macOS development; modern but not inherently disruptive. |
| Makefile | Tech Skill | Build automation; essential but not tied to emerging tech. |
| Vue | Tech Skill | JavaScript framework for front-end; widely used but not disruptive on its own. |
| Jupyter Notebook | Tool | An essential tool for AI workflows, data science, and reproducible research, enabling interactive coding and visualization. |
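The classification above amounts to a lookup from language name to skill group. The sketch below encodes the table as a Python dictionary; the helper names `skill_group` and `count_by_group`, and the `"Unclassified"` fallback for languages not in the table, are assumptions added for illustration.

```python
from collections import Counter

# Skill groups transcribed from the table above.
SKILL_GROUP = {
    "HTML": "Tech Skill",
    "CSS": "Tech Skill",
    "JavaScript": "Tech Skill",
    "Python": "Disruptive Tech Skill",
    "Shell": "Tech Skill",
    "PHP": "Tech Skill",
    "Java": "Tech Skill",
    "TypeScript": "Tech Skill",
    "SCSS": "Tech Skill",
    "Dockerfile": "Tech Skill",
    "C": "Tech Skill",
    "Ruby": "Tech Skill",
    "C++": "Disruptive Tech Skill",
    "C#": "Tech Skill",
    "Kotlin": "Tech Skill",
    "Objective-C": "Tech Skill",
    "Swift": "Tech Skill",
    "Makefile": "Tech Skill",
    "Vue": "Tech Skill",
    "Jupyter Notebook": "Tool",
}


def skill_group(language, default="Unclassified"):
    """Look up the skill group; the default label is a hypothetical fallback."""
    return SKILL_GROUP.get(language, default)


def count_by_group(languages):
    """Count how many of the given languages fall into each skill group."""
    return Counter(skill_group(language) for language in languages)


# Arbitrary sample; COBOL is not in the table and falls back to the default.
sample = ["Python", "C++", "PHP", "Jupyter Notebook", "COBOL"]
print(count_by_group(sample))
```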
Disruptive Tech Skill and subcategories#
| Category | Language | Justification |
|---|---|---|
| AI_ML | Python | Widely used in AI/ML, data science, and scientific computing. |
| | R | Preferred in statistics and data analysis; used in ML pipelines. |
| | Julia | High-performance numerical computing, increasingly used in ML research. |
| | TensorFlow | Leading deep learning framework for building and training ML models. |
| | PyTorch | Popular for research and production ML workflows; dynamic graphing. |
| | CUDA | GPU programming model essential for ML acceleration on NVIDIA GPUs. |
| | MLIR | Compiler framework for optimizing ML computations at multiple levels. |
| | Stan | Probabilistic programming language for Bayesian inference and modeling. |
| Blockchain | Solidity | Main language for writing Ethereum smart contracts. |
| | Move | Secure-by-design language used in projects like Diem and Aptos. |
| | Cairo | Language for zk-STARK-based applications on StarkNet. |
| | Vyper | Simpler, more secure alternative to Solidity for Ethereum contracts. |
| | Yul | Intermediate language for Ethereum smart contract compilation. |
| | Clarity | Language used in Stacks (Bitcoin smart contracts); decidable language. |
| QuantumComputing | Q# | Microsoft’s language for quantum algorithms with QDK support. |
| | Silq | High-level quantum language with strong type safety. |
| | QML | Functional quantum programming with focus on expressivity and safety. |
| | Quil | Low-level language for expressing quantum programs for Rigetti systems. |
| Graphics_AR_VR | ShaderLab | Unity’s language for defining GPU shaders. |
| | HLSL | High-level shading language used by DirectX. |
| | GLSL | Shading language for OpenGL used in graphics rendering. |
| | Metal | Apple’s low-overhead GPU-accelerated graphics and compute language. |
| | WebGL | JavaScript API for rendering interactive 3D graphics in browsers. |
| Robotics_IoT | ROS | Framework for writing robot software, widely adopted in robotics R&D. |
| | URDF | XML format for representing robot models in ROS. |
| | RobotFramework | Test automation framework often used in robotic process automation (RPA). |
| | Arduino | C/C++-based language for embedded systems and IoT devices. |
| | MicroPython | Lightweight Python implementation for microcontrollers and IoT. |
| ModernSystems | Rust | Memory-safe systems programming language; popular in modern infrastructure. |
| | WebAssembly | Portable binary instruction format for fast execution in browsers. |
| | Zig | Systems-level language with safety and performance in mind. |
| | Nim | Compiled language combining performance and expressiveness. |
| | Odin | Data-oriented systems programming language. |
| | Crystal | Compiled language with Ruby-like syntax; aimed at high performance. |
| | V | Simplicity-focused compiled systems language. |
| | Go | Google-backed language for scalable backend systems and infrastructure. |
| DataEngineering | Scala | JVM language used in Spark for distributed data processing. |
| | Spark | Unified analytics engine for big data processing. |
| | Flink | Stream-processing framework for real-time analytics. |
| | Presto | Distributed SQL query engine for big data. |
| | DataWeave | MuleSoft’s language for transforming data. |
| | jq | Lightweight command-line JSON processor. |
| CloudNative | HCL | Configuration language used by Terraform for infrastructure as code. |
| | Terraform | Tool for building, changing, and versioning infrastructure safely. |
| | Jsonnet | Data templating language for defining complex JSON. |
| | CUE | Language for defining, generating, and validating configuration data. |
| | Bicep | DSL for deploying Azure resources declaratively. |
| | Pulumi | Modern infrastructure as code using general-purpose languages. |
| | Nix | Declarative configuration language for reproducible systems. |
| | Docker | Platform for building, shipping, and running containers. |
| | Dockerfile | Script format used to build Docker container images. |
| | Kubernetes | Orchestrator for managing containerized applications at scale. |
| FunctionalProgramming | Haskell | Purely functional language with strong static typing. |
| | Elm | Functional language for web frontend development. |
| | Reason | Syntax-friendly variant of OCaml used in web apps. |
| | ReScript | Rebranding of Reason with a focus on JS interop. |
| | F# | Functional-first language on the .NET platform. |
| | Elixir | Functional language for scalable, fault-tolerant applications. |
| | Erlang | Telecom-born functional language known for concurrency and fault tolerance. |
| | OCaml | General-purpose functional language with powerful type inference. |
| | PureScript | Strongly typed functional language that compiles to JavaScript. |
| | Idris | Language with dependent types for verifiable software. |
| | Lean | Functional language and theorem prover used in formal verification. |
| | Clojure | Functional Lisp dialect for the JVM, focused on immutability. |
Programming Language Classification by Development Domain and Use Case#
| Category | Language | Justification |
|---|---|---|
| WebDev | HTML | Core markup language for creating web content. |
| | CSS | Used for styling web pages. |
| | JavaScript | Primary scripting language for interactive web apps. |
| | TypeScript | Adds static typing to JavaScript; popular in large-scale web apps. |
| | PHP | Server-side scripting language for web development. |
| | Ruby | Used in web frameworks like Ruby on Rails. |
| | | Framework for building dynamic web sites on Microsoft stack. |
| | Vue | Progressive JavaScript framework for front-end development. |
| | React | JS library for building user interfaces. |
| | Angular | Web app framework maintained by Google. |
| | Svelte | Compiler-based front-end framework for high-performance apps. |
| MobileDev | Swift | Language for iOS/macOS apps. |
| | Kotlin | Modern Android development language. |
| | Dart | Language used with Flutter for cross-platform mobile apps. |
| | Java | Legacy Android development language. |
| | Objective-C | Legacy iOS/macOS development language. |
| | Flutter | Framework for building natively compiled mobile apps from one codebase. |
| | React Native | Framework using React for cross-platform mobile apps. |
| DataScience | Python | Most widely used in data science, ML, and automation. |
| | R | Statistical computing and graphics. |
| | Julia | High-performance numerical analysis and ML. |
| | MATLAB | Matrix-based language for numerical computing. |
| | Scala | Used in Spark for big data analytics. |
| | TensorFlow | Open-source deep learning framework. |
| | PyTorch | Dynamic deep learning framework used in research and production. |
| | Jupyter Notebook | Web-based interface for data science and visualization. |
| SystemsProg | C | Low-level language used in systems and OS development. |
| | C++ | Object-oriented systems programming and game engines. |
| | Rust | Memory-safe systems language gaining popularity. |
| | Go | Concurrent language used in scalable systems. |
| | Assembly | Lowest-level human-readable programming language. |
| | Zig | Modern alternative to C with safety guarantees. |
| | Ada | Used in safety-critical systems like aviation. |
| | Fortran | Numerical computing and scientific programming. |
| GameDev | C# | Language for Unity game development. |
| | C++ | Used in Unreal Engine and high-performance games. |
| | Lua | Lightweight scripting used in game engines. |
| | ShaderLab | Unity’s language for writing shaders. |
| | HLSL | High-level shading language for DirectX. |
| | GLSL | Shader language for OpenGL. |
| | UnrealScript | Used in older versions of Unreal Engine. |
| | Godot | Open-source game engine with custom language. |
| | GDScript | Scripting language for the Godot game engine. |
| DevOps | Bash | Scripting language for Unix shell environments. |
| | PowerShell | Task automation in Windows environments. |
| | Python | Widely used in scripting, DevOps, and automation. |
| | Shell | Generic scripting used for command-line and automation. |
| | Groovy | Jenkins pipelines and scripting on the JVM. |
| | Makefile | Build automation tool. |
| | Dockerfile | Used to define Docker container environments. |
| | HCL | HashiCorp Configuration Language for Terraform. |
| | Terraform | Infrastructure as code tool for cloud services. |
| | Ansible | YAML-based configuration management and provisioning tool. |
| Embedded | C | Dominant in embedded systems. |
| | C++ | Used for hardware interfacing and embedded software. |
| | Rust | Used in safe embedded system applications. |
| | MicroPython | Python implementation for microcontrollers. |
| | Arduino | Simplified C++-based platform for electronics. |
| | Verilog | HDL used to model electronic systems. |
| | VHDL | Hardware description language used in FPGAs and ASICs. |
| | SystemVerilog | HDL extending Verilog with features for verification. |
| | Embedded C | C customized for embedded systems. |
| Scientific | MATLAB | Widely used in engineering and academia. |
| | R | Statistical analysis and data visualization. |
| | Julia | Fast numerical computation for researchers. |
| | Fortran | Still used in physics and numerical weather modeling. |
| | Wolfram | Used in symbolic computation and mathematics. |
| | Maple | Mathematical computation and symbolic algebra. |
| | Maxima | Free computer algebra system for symbolic operations. |
| Database | SQL | Standard language for querying relational databases. |
| | TSQL | Microsoft’s extension of SQL. |
| | PLSQL | Procedural extension of SQL for Oracle. |
| | PLpgSQL | Procedural extension of SQL for PostgreSQL. |
| | HiveQL | SQL-like language for querying Hadoop data. |
| | GraphQL | Query language for APIs and graph-based data. |
| | MongoDB Query Language | JSON-like syntax for querying MongoDB. |
| Functional | Haskell | Pure functional language with lazy evaluation. |
| | F# | Functional-first language on .NET. |
| | OCaml | General-purpose functional programming. |
| | Erlang | Known for concurrency and telecom apps. |
| | Elixir | Modern functional language built on Erlang VM. |
| | Clojure | Functional Lisp dialect for JVM. |
| | Scala | Blends OOP and FP; runs on JVM. |
| | Idris | Dependently typed FP language. |
| | PureScript | Strongly typed FP language for JavaScript. |
| | Reason | Syntax-friendly OCaml variant. |
| UI_UX | CSS | Styling content in web applications. |
| | Markdown | Lightweight markup for formatting text. |
| | XML | Markup language for data structure. |
| | XSLT | Transform XML documents. |
| | SASS | CSS preprocessor for better syntax. |
| | LESS | CSS preprocessor with variables/functions. |
| | Pug | Templating engine for HTML. |
| | Handlebars | Semantic templating language for JavaScript. |
| | Blade | Laravel’s templating engine. |
| | Twig | Templating engine for PHP apps (used in Symfony). |
| Legacy | COBOL | Business-oriented language still used in mainframes. |
| | Pascal | Teaching language in early computing. |
| | Smalltalk | OOP pioneer with niche educational use. |
| | Prolog | Logic programming language for AI and linguistics. |
| | Lisp | One of the oldest AI-focused languages. |
| | Delphi | Visual language for Windows apps. |
| | Forth | Stack-based embedded systems language. |
| | APL | Concise language for matrix operations. |
| | Ada | Safety-critical systems (military, aerospace). |
| | Modula-2 | Successor to Pascal for modular programming. |
| | BASIC | Beginner-oriented teaching language. |
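Unlike the skill-group classification, the domain classification is many-to-many: a language such as C++ appears under SystemsProg, GameDev, and Embedded. One way to work with this is to invert the category listing into a per-language index, as in the sketch below. It uses only an excerpt of the table, and the names `DOMAIN_LANGUAGES` and `invert` are hypothetical.

```python
# Excerpt of the domain table above (not the full listing).
DOMAIN_LANGUAGES = {
    "DataScience": ["Python", "R", "Julia"],
    "SystemsProg": ["C", "C++", "Rust"],
    "GameDev": ["C#", "C++", "Lua"],
    "DevOps": ["Bash", "Python", "Shell"],
    "Embedded": ["C", "C++", "Rust"],
}


def invert(domain_languages):
    """Map each language to the list of domains that include it."""
    domains_of = {}
    for domain, languages in domain_languages.items():
        for language in languages:
            domains_of.setdefault(language, []).append(domain)
    return domains_of


domains_of = invert(DOMAIN_LANGUAGES)
print(domains_of["C++"])
print(domains_of["Python"])
```

Any per-domain count built from such an index attributes a language's activity to every domain it belongs to, so domain totals should not be summed into an overall total.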