My one-year journey writing a complex piece of software
After one year of developing CLUBASID, I would like to share my journey so far. CLUBASID is an infectious disease simulation tool as well as a data visualisation and data analytics tool. I started working on CLUBASID on the 15th of March 2020 as a full-time solo project, and I am still working on it solo. I want to share how this project evolved over time and how it got to where it is right now. I built it with the Unity engine (using C# and a sprinkle of HLSL). This technology choice made the project harder in some respects and easier in others. As Unity is primarily a game engine, lots of things did not come out of the box and I had to implement them myself (e.g. dialog boxes, graphs and a file picker window all had to be built from scratch). On the other hand, I got Unity’s update loop, which played an important role in the simulation aspect of this software.
15th of March to 17th of March, 2020 (Minimum Viable Product)
On the 15th of March 2020, I started working on software to simulate the spread of Covid-19 in a small population group. The idea was to have a bunch of dots represent individuals in a population. White dots represented uninfected people, blue dots represented asymptomatic infected people and red dots represented symptomatic infected people. Once an uninfected person comes close to an infected person, the uninfected person has an X percent chance of being infected, where X is between 0 and 100. After the incubation period, asymptomatic infected individuals start showing symptoms, after which they either recover or die. At this point the name of the software was “Covid-19 pandemic simulator”.
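The infection rule described above can be sketched roughly as follows. This is not the actual Unity C# code; it is a minimal Python illustration, and every name in it (`Person`, `step`, `contact_radius`, `incubation_days`) is hypothetical. Recovery and death are left out to keep the sketch short.

```python
import math
import random
from dataclasses import dataclass

# Hypothetical state names; the post shows them as white, blue and red dots.
UNINFECTED, ASYMPTOMATIC, SYMPTOMATIC = "uninfected", "asymptomatic", "symptomatic"


@dataclass
class Person:
    x: float
    y: float
    state: str = UNINFECTED
    days_infected: int = 0


def step(people, infection_chance, contact_radius, incubation_days, rng: random.Random):
    """Advance the population by one simulated day; return the number of new infections."""
    infectious = [p for p in people if p.state != UNINFECTED]
    newly_infected = []
    for p in people:
        if p.state != UNINFECTED:
            continue
        for q in infectious:
            close = math.hypot(p.x - q.x, p.y - q.y) <= contact_radius
            # infection_chance is the "X" from the text, a percentage in [0, 100].
            if close and rng.random() * 100 < infection_chance:
                newly_infected.append(p)
                break
    # Existing infections age by one day and may become symptomatic.
    for q in infectious:
        q.days_infected += 1
        if q.state == ASYMPTOMATIC and q.days_infected >= incubation_days:
            q.state = SYMPTOMATIC
    # New infections start out asymptomatic.
    for p in newly_infected:
        p.state = ASYMPTOMATIC
    return len(newly_infected)
```

Applying new infections only at the end of the step keeps someone infected today from infecting others on the same day, which is one simple way to make the result independent of iteration order.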
20th of March, 2020 (Real Time Graph and clusters)
By the 20th of March, I was able to group the population into clusters. The boxes you see below represent the boundaries of each cluster. I also added a real-time graph on the right to plot what is going on in the simulation (the graph still had bugs at this time).
29th of March, 2020 (Working graph and charts)
As mentioned earlier, I am using the Unity engine to build this software. The engine itself does not come with chart UI components, so I had to build mine from scratch. I started by making pie charts and bar charts. I was also able to fix the bugs in the graph on the right.
3rd of April, 2020 (public places, Lockdowns and Isolation Centers)
Instead of individuals just moving aimlessly in all directions, I added features that let you simulate individuals heading to a particular location, the effectiveness of a lockdown, and individuals going to isolation centers when they start showing symptoms or when they are asymptomatic but have tested positive for the disease. During a lockdown, individuals are only allowed to go to the location closest to them. This usually helps flatten the curve, as infected individuals do not come in contact with too many people.
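The lockdown rule (only the closest location is allowed) can be illustrated with a short Python sketch. The function names and the idea of picking a random location during free movement are assumptions made for the example, not a description of the real implementation.

```python
import math
import random


def nearest_location(position, locations):
    """Return the location with the smallest straight-line distance to position."""
    return min(locations, key=lambda loc: math.dist(position, loc))


def choose_destination(position, locations, lockdown, rng=random):
    """Pick where an individual heads next.

    Under lockdown only the closest location is allowed; otherwise any
    location may be chosen (assumed here to be a uniform random pick).
    """
    if lockdown:
        return nearest_location(position, locations)
    return rng.choice(locations)
```

For example, `choose_destination((0, 0), [(5, 5), (1, 1), (-3, 0)], lockdown=True)` returns `(1, 1)`.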
16th of April, 2020 (Graph Analysis and more)
I added features that help the user analyse the plot on the right, such as the area under the curve. The area under the curve in this case simply sums the number of infections recorded each day. Note that this number does not represent the total number of unique infections in the population. That would be calculated as number of infected + number of recovered + number of deaths.
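A small worked example shows why the two numbers differ. The daily figures below are made up for illustration, and the function names are hypothetical.

```python
def area_under_curve(active_infections_per_day):
    """Sum the daily infection counts (a person-days measure)."""
    return sum(active_infections_per_day)


def total_unique_infections(infected, recovered, deaths):
    """Count each person who was ever infected exactly once."""
    return infected + recovered + deaths


# Made-up run: 1, 3, 4 and then 2 people are infected on days 1 to 4.
daily_counts = [1, 3, 4, 2]
# At the end of day 4: 2 still infected, 2 recovered, 0 deaths.
auc = area_under_curve(daily_counts)          # 10
unique = total_unique_infections(2, 2, 0)     # 4
```

The area under the curve counts the same person once for every day they stay infected, so it grows with the duration of illness, while the unique total does not.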
At this point, I renamed the software to “pandemic simulator”.
19th of July, 2020 (Aerosol Spread)
It was at this point that I decided to change the name to CLUBASID (CLUster BAsed Simulation of Infectious Diseases). I added simulations for other modes of transmission such as aerosol, STDs, direct and indirect contact. The image below shows a simulation of a disease that spreads through aerosol. The blue circles indicate infected areas. For example, when an individual sneezes, infected particles travel through the air and remain there for some time.
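One way to model infected areas that linger for a while is to give each one a radius and a remaining lifetime. The Python sketch below assumes circular areas and a fixed lifetime in seconds; `AerosolCloud`, `update_clouds` and `is_exposed` are hypothetical names, not taken from CLUBASID.

```python
import math
from dataclasses import dataclass


@dataclass
class AerosolCloud:
    x: float
    y: float
    radius: float
    seconds_left: float


def update_clouds(clouds, dt):
    """Age every cloud by dt seconds and drop the ones that have expired."""
    for cloud in clouds:
        cloud.seconds_left -= dt
    return [cloud for cloud in clouds if cloud.seconds_left > 0]


def is_exposed(px, py, clouds):
    """True if the point (px, py) lies inside at least one active cloud."""
    return any(math.hypot(px - c.x, py - c.y) <= c.radius for c in clouds)
```

An exposed individual would then go through the same chance-based infection check as for direct contact.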
27th of July, 2020 (STIs)
In the STI simulation, individuals mate with someone of the opposite sex and a similar age, based on familiarity. When an individual comes in contact with someone of the opposite sex and a similar age, they start to “know each other”: each stores the other’s ID and the total amount of time they have known each other. Once that amount of time reaches a certain threshold, the two will mate. The threshold is dictated by a variable called “promiscuity level”. The higher the “promiscuity level”, the lower the threshold, and vice versa. High condom usage also reduces the chance of the disease spreading (dictated by the “condom usage” variable).
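The familiarity mechanism can be sketched in Python as below. The post does not give the exact formulas, so the inverse relation between promiscuity level and threshold, the per-individual threshold, and the linear effect of condom usage are all assumptions, as are the names. The sex and age eligibility check is omitted.

```python
from collections import defaultdict


class Individual:
    def __init__(self, ident, promiscuity_level, base_threshold=100.0):
        self.ident = ident
        # Assumed relation: a higher promiscuity level gives a lower threshold.
        self.threshold = base_threshold / promiscuity_level
        # Maps another individual's ID to the total time they have known each other.
        self.time_known = defaultdict(float)


def record_contact(a, b, dt):
    """Add dt to the pair's familiarity; return True once both thresholds are met."""
    a.time_known[b.ident] += dt
    b.time_known[a.ident] += dt
    return (a.time_known[b.ident] >= a.threshold
            and b.time_known[a.ident] >= b.threshold)


def transmission_chance(base_chance, condom_usage):
    """Scale the base chance down linearly; condom_usage is a fraction in [0, 1]."""
    return base_chance * (1.0 - condom_usage)
```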
22nd of August, 2020 (Tool for adding large population easily as well as new graph)
I added a tool that makes it easy for the user to add a large population (up to 50,000 people), with variations that the user has control over (e.g. population per cluster).
17th of September, 2020 (Vaccination and medical intervention)
It is not enough to be able to simulate the spread of a disease in a population. One must also be able to simulate the effects of vaccination or other forms of medical intervention such as curative medicines. I added features that let the user vaccinate individuals to trigger antibodies that prevent them from contracting the disease.
10th of November, 2020 (Multi-Restriction Simulation)
This might sound trivial, but it was one of the more challenging features to implement in the simulation part of this software. This feature allows a cluster to have more than one restriction type. Previously, a cluster could only have one restriction type (e.g. lockdown, free movement, etc.). Now you can have a cluster follow the lockdown rule for a certain amount of time and then change to free movement after a specific time period.
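A cluster with several restriction phases can be thought of as a schedule that is looked up by simulation day. The Python sketch below shows one possible shape for such a schedule; the class name, the day-based timing and the default of free movement before the first phase are assumptions for illustration.

```python
from bisect import bisect_right


class RestrictionSchedule:
    """Ordered phases for one cluster, each given as (start_day, restriction)."""

    def __init__(self, phases, default="free movement"):
        self.phases = sorted(phases)
        self.starts = [start for start, _ in self.phases]
        self.default = default

    def restriction_on(self, day):
        """Return the restriction of the latest phase that has started by this day."""
        index = bisect_right(self.starts, day) - 1
        if index < 0:
            return self.default  # no phase has started yet
        return self.phases[index][1]


# Lockdown for the first 14 days, then free movement.
schedule = RestrictionSchedule([(0, "lockdown"), (14, "free movement")])
```

With this schedule, `schedule.restriction_on(13)` gives `"lockdown"` and `schedule.restriction_on(14)` gives `"free movement"`.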
Major decision on the 10th of December, 2020
I made a major decision on the 10th of December, 2020. I asked myself why a user could not import data from a simulation they had carried out in other software into CLUBASID and take advantage of the visuals in CLUBASID. I then thought that a CSV file import would be a good fit, as CSV is a common format for tabular data. This goes beyond infectious diseases: as long as the data can be imported as CSV, it could be financial data, medical data, etc. Whatever it is, you can import it and use it in CLUBASID. This was a major shift in development.
10th of January, 2021 (Data Visualisation)
CLUBASID is now also a visualisation tool, meaning that you can import your CSV file (containing numeric data) and visualise it using different charts. At this point, only the pie chart, bar chart, correlation graph and animated scatter plot were available.
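To give an idea of what importing numeric CSV data involves, here is a minimal Python sketch that keeps only the columns whose values all parse as numbers. It is an illustration of the general idea, not CLUBASID's importer, and the function name and sample data are made up.

```python
import csv
import io


def load_numeric_columns(csv_text):
    """Parse CSV text with a header row and return {column name: list of floats}.

    Columns containing any non-numeric or missing value are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            if name in columns:  # ignore surplus fields without a header
                columns[name].append(value)
    numeric = {}
    for name, values in columns.items():
        try:
            numeric[name] = [float(v) for v in values]
        except (TypeError, ValueError):
            continue  # not a purely numeric column
    return numeric


sample = "country,cases,deaths\nA,10,1\nB,20.5,2\n"
data = load_numeric_columns(sample)
# data == {"cases": [10.0, 20.5], "deaths": [1.0, 2.0]}
```

The text column `country` is dropped here; a real tool would more likely keep it as a label for the charts.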
30th of January, 2021 (3d graph)
I added a 3D graph (still in its raw form). You can easily see the relationship between three attributes on the X, Y and Z axes.
15th of February, 2021 (more charts and bug fixes)
I added several charts, including a column bar chart, range chart, line chart and race bar chart. I also made significant improvements to the 3D graph.
6th of March, 2021 (More Charts)
After a few small bug fixes here and there, I added a radar (spider) chart, heat map, gauge chart and histogram.
11th of March, 2021 (Geo-Map Chart)
I designed a map of the world using Blender (software for 3D modelling and more). I then brought the 3D model of the map into Unity. The map currently contains about 156 countries.
Lessons so far
Design patterns and clean software architecture are very important. They have never been more important to me than on this project, which currently sits at over 300 class files and over 50 thousand lines of code. Some good tips I followed wholeheartedly include:
- Naming your variables and functions so well that you do not need comments to explain them. For example, a function that gets the average salary of workers in a particular country could be named getAverageSalaryOfWorkersInCountry(string country_name).
- Be strict with your naming conventions.
- Having UML diagrams for your software is very important.
- Design patterns are very important and I picked the ones that suited my project very well.
- Version control is very important. I like to save a new version after significant changes have been made to the software.
- Testing! Testing! Testing! Not just unit test, but practically using the software just like an end user would.
- Constantly getting feedback from others.
- Add new features when necessary.
- Document every step of the way.
- Always try to refactor when necessary.
Why did I use Unity engine
The disease simulation aspect made Unity a good choice. The simulation requires constant updates (just like a game would), so it needed to run on an update loop; it could not have been an event-based system.
I have also been using Unity for about 6 years and feel very comfortable with it. In addition, the fact that Unity is originally a game engine makes some things easier to implement, such as the 3D spatial graph.
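The difference from an event-based design is that every entity is advanced on every frame, whether or not anything has "happened" to it. The Python sketch below imitates that pattern in a few lines. It does not use Unity's API, and `Dot` and `run_frames` are hypothetical names.

```python
class Dot:
    """A moving individual that is advanced a little on every frame."""

    def __init__(self, x, speed):
        self.x = x
        self.speed = speed

    def update(self, dt):
        # Called once per frame with the time elapsed since the last frame.
        self.x += self.speed * dt


def run_frames(entities, frames, dt):
    """Drive the simulation: call update(dt) on every entity for each frame."""
    for _ in range(frames):
        for entity in entities:
            entity.update(dt)


dot = Dot(x=0.0, speed=2.0)
run_frames([dot], frames=4, dt=0.5)  # dot.x is now 4.0
```

In Unity the engine plays the role of `run_frames`, so continuous movement and proximity checks fit naturally into the per-frame update.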
Where is the product heading
Honestly, at this point, I just keep adding feature upon feature. The next step will be adding classification and clustering algorithms for machine learning on the data. After a few more months of polishing, I see this software being used by lots of people for data visualisation and data analytics. The disease simulation aspect will continue to be a niche part, as there aren’t many people who need it (compared to the data visualisation and analytics aspect of the software).
CLUBASID is live and available on the Microsoft store. https://www.microsoft.com/en-us/p/pandemic-simulator/9n86x1xhz765?activetab=pivot:overviewtab