Eurovision for nerds: How fun it would be to make a Kibana dashboard of the winners

Reading Time: 3 minutes

Purpose

The purpose of this article is to lay out a fun idea I have, which I hope will help me towards my eventual goal of becoming an Elastic Certified Engineer.

What’s in it for you, dear reader?

If you would like to follow along and learn how to make fun and informative visualisations of data, even if you have never heard of Kibana before, and even if you have no data, then this is for you.

🙂

What is Kibana, anyway?

Kibana is a window into the Elastic Stack. It enables visual exploration and real-time analysis of your data in Elasticsearch… [enabling] data exploration, visualization, and dashboarding.

https://www.elastic.co/webinars/getting-started-kibana?elektra=home&storm=sub2&iesrc=ctr

What is Eurovision, anyway?

Just in case we don’t share the same hobbies, the Eurovision Song Contest is an annual international song competition, primarily for European countries. It is colourful, kitsch, and classy, all wrapped in one giant live TV spectacle. Past winners include ABBA (1974) and Celine Dion (1988), and the favourite is awarded the now famous douze points (twelve points) by enough other countries to make them the winner.

Eurovision has been broadcasting since 1956, making it one of the world’s longest-running television programmes, with audience figures of between 100 million and 600 million internationally.

How on earth will I do this?

Things might change as I go, though for now I have outlined the following phases of the project. I have tried to make each phase as realistic as possible, for someone who has never done this before.

Phase 1

  • Create a raw data file
    • …perhaps in a Google Drive spreadsheet that can be exported as a .csv file, covering the past 10 years of Eurovision winners. Copy and paste manually from sites such as Wikipedia if I cannot find an existing database. Don’t try to build a data scraper or automate this at this stage
    • Store this in source control, e.g. GitHub
  • Install Elastic locally on my computer
    • …at home, and on my laptop, depending on where I am working from. Do not try to create a Docker image or store this in source control at this point
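
As a rough illustration of the raw data file in Phase 1, here is a minimal Python sketch that builds the CSV text. The column names (year, country, artist) are an assumption made for this example, not a schema the project has settled on, and only the two winners named earlier in this article are included.

```python
import csv
import io

# Column names are an assumption for illustration only.
FIELDNAMES = ["year", "country", "artist"]

# Only the two winners mentioned in the article; the real file would hold more rows.
WINNERS = [
    {"year": 1974, "country": "Sweden", "artist": "ABBA"},
    {"year": 1988, "country": "Switzerland", "artist": "Celine Dion"},
]


def winners_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save_winners(path, rows):
    """Write the CSV to disk so it can be committed to source control."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(winners_to_csv(rows))
```

Calling `save_winners("eurovision_winners.csv", WINNERS)` would produce a small file with one header line and one line per winner. The file name is hypothetical.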

Phase 2

  • Import this data file to Kibana
  • Create some visualisations
  • I will probably want to do some data enhancements at this point, such as adding more data, or converting the .csv to SQL or a more viable database solution
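
How the import will actually happen is still open. One possible route, sketched below, is to convert the CSV into the newline-delimited JSON body that the Elasticsearch bulk API accepts: an action line followed by a document line for each row. The index name `eurovision-winners` and the column names are assumptions for this example, and the sketch only builds the request body; it does not talk to a running cluster.

```python
import csv
import io
import json

# Hypothetical index name, chosen for this sketch only.
INDEX_NAME = "eurovision-winners"

# Sample input using assumed column names and the two winners named in the article.
SAMPLE_CSV = "year,country,artist\n1974,Sweden,ABBA\n1988,Switzerland,Celine Dion\n"


def csv_to_bulk_ndjson(csv_text, index_name=INDEX_NAME):
    """Turn CSV text into a newline-delimited JSON body for a bulk request."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["year"] = int(row["year"])  # store the year as a number, not a string
        # Action line: tells Elasticsearch to index the next line into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(row))
    # A bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

Once the documents are in an index, Kibana can be pointed at that index to build visualisations.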

Phase 3

… in order to have a package that can be stored in the cloud, and thus used and run by anyone.

🙂

Blog, book, and do. Repeat.

I hope to blog, write and actually do this project at the same time, updating and revising the steps as I go. I hope anyone with an interest in this will be able to follow the steps.

I will write this up in detail on Leanpub as I go:

https://leanpub.com/FantasticElastic

FANTASTIC ELASTIC Book cover, by Anita Lipsky
FANTASTIC ELASTIC: My Journey to Visualise Eurovision winners using Kibana dashboards

One more thing…

It would be amazing to get feedback along the way, so that I can make the steps as clear and as helpful as possible, for all dear readers. So please reach out and send any feedback or questions you might have.

After all, Elastic IS fantastic, just like the Eurovision song contest, and I think you will have fun with me on this journey!

Thanks for reading, and until next time!

Australia Released a Banknote With a Typo, And Nobody Noticed For 6 Months

Reading Time: 1 minute

2019 05 12 – Click to read full article

IMPACT: Well, the money is still valid, and will probably become a collector’s item… so just embarrassment

ROOT CAUSE: typo / copy paste error … so not really a software bug… just so funny that a spell checker did not catch this

COMPANY RESPONSE: the misspelled notes already in circulation will eventually be replaced – even though that might take some time

Coles blames IT glitch after stores forced to shut their doors

Reading Time: 1 minute

2018 09 22 – Click to read full article

IMPACT: Coles supermarkets across the country were unable to open on time on Sunday

ROOT CAUSE: a “network error” / a “nationwide computer problem” / some minor IT problems … out of our team members’ control

COMPANY RESPONSE: Our store is currently closed due to circumstances beyond our control… We apologise
 

Mercedes find bug that robbed Hamilton of victory

Reading Time: 1 minute

2018 04 06 – Click to read full article

ROOT CAUSE: The issue isn’t… with the race strategy software… we found a bug in [an offline] [tool… used to create delta lap times] that meant that it gave us the wrong number… Had Hamilton known, he could have gone faster after his earlier stop to give himself a greater margin.

PREVENTATIVE MEASURES: an extensive analysis had been carried out and processes put in place “to make sure that we don’t have a repeat”

 

Oculus Rift re-enters virtual space after bad software caused a global blackout

Reading Time: 1 minute

2018 03 08 – Click to read full article

ROOT CAUSE: a software-based certificate expires

ROOT CAUSE: OculusAppFramework.dll file currently has an invalid, expired certificate, preventing Windows 10 from running the software needed to power the Oculus Rift

IMPACT: can’t drive the Oculus Rift to any virtual realm

IMPACT: faced a warning message when booted up, blocking use of the device: It can’t use the Oculus Runtime Service

SOLUTION: Oculus VR must renew the “invalid” certificate and redistribute the software with the new certificate intact

 

How to triage bugs. Or, avoiding “just one more quick bug fix” because that is really, really, really, really, really expensive

Reading Time: 4 minutes

What is the problem, exactly?

It is easy to fall into the trap of adding “just one more quick bug fix” into the build before sending the software out to the customer.  Doing this repeatedly can cause lengthy delays, as typically any change can impact other areas of the software, creating a snowball effect of changes that again need to be triaged and handled in some way.

Since time is money, delays to delivery cost the company money they did not plan to spend on that part of the project.  Also these delivery delays can impact on your relationship with your customer.  I believe the golden question to ask your customer is “how likely are you to recommend us?”… and if they are constantly irritated by unforeseen delays, they might be less and less likely to recommend, and even stop doing business with you.

 

But it will only be quick, and we are really, really sure this is the last one

The real issue, though, is triaging based on a description of a bug fix being “just one more quick bug fix”, which implies it is only up for grabs since it is “quick” and “just one more”.  This description does not mention the criteria that should be used when triaging.

 

But our test coverage is excellent, and code is only merged to the main pipeline after all automated tests are run

Good for you! Then in theory regressions should catch most important issues.  And if tests were added at the time of development, even better.  Perhaps this post isn’t for you then.

 

So what is the way to triage bugs?

I am a big advocate of agile software development within a scrum team because it enables the team to deliver software regularly.  Even if it is not perfect, a team triage session, which basically means having opinions from business, technical and QA perspectives* all together in one session, makes it possible to quickly accept or reject tickets that will be worked on.

To ensure the team is not wasting company time, and thus a whole bucket load of money, it is important to reject tickets (features, improvements, bugs, and so on), or otherwise scale down the work to be done in the ticket to only the essentials.

Let me say that again.

It is important to reject tickets:  it is important to reject features, improvements, bugs…

 

How important is it to reject bugs?

Really, really, really, really, really important 🙂

 

How to quickly triage bugs, to ensure your software release does not become very expensive

In a session with your team, with all three perspectives* present, triage bugs based on the risk of the bug going into production, which I assess using likelihood and impact.  Also triage bugs on the importance of the affected feature to the customer, which I will call customer value.  And finally, as with any ticket, think about the QA perspective, which is the risk of doing development work in this area, again using likelihood and impact.

 

  1. Risk of this bug in production, “risk of bug”: Likelihood of this happening in production
  2. Risk of this bug in production, “risk of bug”: Impact of this happening in production
  3. Customer value: Is this bug affecting any feature that the customer will be using?
  4. Risk of development work in this area, “risk of bug fix”: Likelihood of software development on this bug fix affecting other areas of the software in production
  5. Risk of development work in this area, “risk of bug fix”: Impact of software development on this bug fix if it affects other areas
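
The five criteria above could be captured as a small record, sketched here in Python. The three-level scale and the idea of multiplying likelihood by impact to get a single risk number are assumptions made for this illustration; the criteria themselves do not prescribe a scoring formula.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Assumed three-step scale; any ordered scale would work."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class TriageAssessment:
    bug_likelihood: Level   # 1. likelihood of the bug happening in production
    bug_impact: Level       # 2. impact of the bug happening in production
    customer_value: Level   # 3. does it affect a feature the customer uses?
    fix_likelihood: Level   # 4. likelihood of the fix affecting other areas
    fix_impact: Level       # 5. impact of the fix if it affects other areas

    def risk_of_bug(self) -> int:
        # Assumed scoring: likelihood multiplied by impact.
        return self.bug_likelihood * self.bug_impact

    def risk_of_fix(self) -> int:
        return self.fix_likelihood * self.fix_impact
```

Keeping the two risks as separate numbers makes it visible when fixing a bug is riskier than leaving it alone.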

 

If some of these terms sound familiar, that is because they come from the ISTQB Agile certification I obtained.

 

Let’s go through an example

Bug: Bank statement shows the balance in grey instead of green, for balances that are a positive number

  1. Risk of bug, likelihood: High
    • 100% likely, as this is repeatable for certain scenarios.
  2. Risk of bug, impact: Low
    • Team assesses and realises this will only happen for a certain type of customer account, and these make up only 2% of the customer base.
    • Team checks production logs and can see that those customers mostly check balances using another screen.  In the last 24 hours, no customer has checked a balance using that particular screen.
    • The grey is the default text colour used for all text on the screen, so this does not stand out as unusual
  3. Customer value: Low
    • This grey text colour does not prevent the customer from checking their balance, so no.
    • In addition, the customers have been promised new features that will enable the bank to sell them new products through this new app, which will greatly increase the bank revenue.
    • The product manager is under a lot of pressure to get that product out to market quickly, and needs the team to focus on that as a priority
  4. Risk of bug fix, likelihood: High
    • QA states this is in an area of code with little UI test coverage, since it is a new app.  So fixing this bug will most likely cause new bugs in this area
  5. Risk of bug fix, impact: High
    • Developer states the impact of changing the style here is that it could render some of the balance unreadable, due to a high amount of business logic in the UI instead of the API, as this new app was rushed to market.  The balance should always be readable, so the impact is high

 

You should be seeing the obvious here: this ticket should be rejected.

Therefore, triaging on “how long it takes to fix” can generate the wrong answer, since this is the kind of bug fix that the development team will typically say is “just five minutes to fix”.
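
To make the contrast concrete, here is a small Python sketch that runs the bank statement example through two triage rules. The ratings are the ones from the example above. The numeric scale, the ten-minute "quick fix" threshold and the accept rule (fix only when bug risk weighted by customer value exceeds the risk of the fix) are hypothetical choices made for this illustration, not a standard formula.

```python
# Assumed numeric scale for the ratings used in the example.
LEVELS = {"low": 1, "medium": 2, "high": 3}

# Ratings taken from the bank statement example; tuples are (likelihood, impact).
example_ticket = {
    "risk_of_bug": ("high", "low"),
    "customer_value": "low",
    "risk_of_bug_fix": ("high", "high"),
    "estimated_fix_minutes": 5,
}


def risk(pair):
    """Assumed scoring: likelihood multiplied by impact."""
    likelihood, impact = pair
    return LEVELS[likelihood] * LEVELS[impact]


def accept_by_fix_time(ticket, quick_minutes=10):
    """The tempting rule: accept anything that sounds quick."""
    return ticket["estimated_fix_minutes"] <= quick_minutes


def accept_by_criteria(ticket):
    """Hypothetical rule: accept only if leaving the bug costs more than fixing it."""
    benefit = risk(ticket["risk_of_bug"]) * LEVELS[ticket["customer_value"]]
    cost = risk(ticket["risk_of_bug_fix"])
    return benefit > cost
```

With these ratings the fix-time rule accepts the ticket, while the criteria-based rule rejects it because the risk of the fix outweighs the risk of the bug.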

Conclusion

Imagine that you have another five to ten tickets like this that are up for triaging.  It is easy, and perhaps tempting, to fall into the trap of accepting them all on the assumption that they will “just take five minutes to fix”.  In reality, any change has an impact.  In addition, the “fix” might take “just five minutes”, but what about the building of the pipeline, the retesting, and the new cycle if something breaks as a result of that fix?

Multiply that by five or ten, and you will have a frustrated team bogged down by irrelevant tickets, costing the company time and money.
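
A back-of-the-envelope version of that multiplication, as a Python sketch. Apart from the five-minute fix, every number here is a placeholder invented for illustration (build time, retest time, and the chance that a fix triggers a second cycle); substitute your own team's figures.

```python
def ticket_cost_minutes(fix, build, retest, rework_probability=0.0):
    """Expected minutes for one ticket, including a possible repeat cycle."""
    one_cycle = fix + build + retest
    # If the fix breaks something, assume the whole cycle is repeated once.
    return one_cycle * (1 + rework_probability)


# Placeholder figures: 5-minute fix, 20-minute build, 35-minute retest,
# and a 25% chance that the fix breaks something.
per_ticket = ticket_cost_minutes(fix=5, build=20, retest=35, rework_probability=0.25)

# Ten such "quick" tickets accepted in one triage session.
total_for_ten = per_ticket * 10
```

Under these made-up figures the fix itself is a small fraction of the cost of each ticket; the build, the retest and the possible repeat cycle account for the rest.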

 

* Summary of the three perspectives:

  1. a business / customer facing perspective “what does the customer need”,
  2. a technical perspective “how to implement to solve this, which architecture…”,
  3. and a QA perspective “are we building testable software, what are the acceptance criteria for each user story…”