Tag - socorro

Entries feed - Comments feed

Wednesday, 10 June 2015

Rethinking Socorro's Web App

rewrite-cycle.jpg
Credits @lxt

I have been thinking a lot about what we could do better with Socorro's webapp in the last months (and even more, the first discussions I had about this with phrawzty date from Spring last year). Recently, in a meeting with Lonnen (my manager), I said "this is what I would do if I were to rebuild Socorro's webapp from scratch today". In this post I want to write down what I said and elaborate it, in the hope that it will serve as a starting point for upcoming discussions with my colleagues.

State of the Art

First, let's take a look at the current state of the webapp. According to our analytics, there are 5 parts of the app that are heavily consulted, and a bunch of other less used pages. The core features of Socorro's front-end are:

Those we know people are looking at a lot. Then there are other pages, like Crashes per User, Top Changers, Explosive Crashes, GC Crashes and so on that are used from "a lot less" to "almost never". And finally there's the public API, on which we don't have much analytics, but which we know is being used for many different things (for example: Spectateur, crash-stats-api-magic, Are we shutting down yet?, Such Comments).

The next important thing to take into account is that our users oftentimes ask us for some specific dataset or report. Those are useful at a point in time for a few people, but will soon become useless to anyone. We used to try and build such reports into the webapp (and I suppose the ones from above that are not used anymore fall into that category), but that costs us time to build and time to maintain. And that also means that the report will have to be built by someone from the Socorro team who has time for it, it will go through review and testing, and by the time it hits our production site it might not be so useful anymore. We have all been working on trying to reduce that "time to production", which resulted in the public API and Super Search. And I'm quite sure we can do even better.

Building Reports

bob-the-builder.jpg

Every report is, at its core, a query of one or several API endpoints, some logic applied to the data from the API, and a view generated from that data. Some reports require very specific data, asking for dedicated API endpoints, but most of them could be done using either Super Search alone or some combination of it with other API endpoints. So maybe we could facilitate the creation of such reports?

Let us put aside the authentication and ACL features, the API, the admin panel, and a few very specific features of the web app, to focus on the user-facing features. Those can be simply considered as a collection of reports: they all call one or several models, have a controller that does some logic, and then are displayed via a Django template. I think what we want to give our users is a way to easily build their own reports. I would like them to be able to answer their needs as fast as possible, without depending on the Socorro team.

The basic brick of a fresh web app would thus be a report builder. It would be split in 3 parts:

  • the model controls the data that should be fetched from the API;
  • the controller gets that data and performs logic on it, transforming it to fit the needs of the user;
  • and the view will take the transformed data and turn it into something pretty, like a table or a graph.

Each report could be saved, bookmarked, shared with others, forked, modified, and so on. Spectateur is a prototype of such a report builder.

We developers of Socorro would use that report system to build the core features of the app (top crashers, home page graphs, etc. ), maybe with some privileges. And then users will be able to build reports for their own use or to share with teammates. We know that users have different needs depending on what they are working on (someone working on FirefoxOS will not look at the same reports than someone working on Thunderbird), so this would be one step towards allowing them to customize their Socorro.

One Dashboard to Rule Them All

So users can build their own reports. Now what if we pushed customization even further? Each report has a view part, and that's what would be of interest to people most of the time. Maybe we could make it easy for a user to quickly see the main reports that are of interest to them? My second proposal would be to build a dashboard system, which would show the views of various reports on a single page.

A dashboard is a collection of reports. It is possible to remove or add new reports to a dashboard, and to move them around. A user can also create several dashboards: for example, one for Firefox Nightly, one for Thunderbird, one for an ongoing investigation... Dashboards only show the view part of a report, with links to inspect it further or modify it.

dashboard-example.png

An example of what a dashboard could look like.

Socorro As A Platform

The overall idea of this new Socorro is to make it a platform where people can find what they want very quickly, personalize their tool, and build whatever feature they need that does not exist yet. I would like it to be a better tool for our users, to help them be even more efficient crash killers.

I can see several advantages to such a platform:

  • time to create new reports is shorter;
  • people can collaborate on reports;
  • users can tweak existing reports to better fit their needs;
  • people can customize the entire app to be focused on what they want;
  • when you give data to people, they build things that you did not even dream about. I expect that will happen on Socorro, and people will come up with incredibly useful reports.

I Need Feedback

feedback-everywhere.jpg

Concretely, the plan would be to build a brand new app along the existing one. The goal won't be to replace it right away, but instead to build the tools that would then be used to replace what we currently have. We would keep both web apps side by side for a time, continuing to fix bugs in the Django app, but investing all development time in the new app. And we would slowly push users towards the new one, probably by removing features from the Django app once the equivalent is ready.

I would love to discuss this with anyone interested. The upcoming all-hands meeting in Whistler is probably going to be the perfect occasion to have a beer and share opinions, but other options would be fine (email, IRC... ). Let me know what you think!

Thursday, 26 February 2015

Spectateur, custom reports for crash-stats

The users of Socorro at Mozilla, the Stability team, have very specific needs that vary over time. They need specific reports for the data we have, new aggregations or views with some special set of parameters. What we developers of Socorro used to do was to build those reports for them. It's a long process that usually requires adding something to our database's schema, adding a middleware endpoint and creating a new page in our webapp. All those steps take a long time, and sometimes we understand the needs incorrectly, so it takes even longer. Not the best way to invest our time.

Nowadays, we have Super Search, a flexible interface to our data, that allows users to do a lot of those specific things they need. As it is highly configurable, it's easy to keep the pace of new additions to the crash reports and to evolve the capabilities of this tool. Couple that with our public API and we can say that our users have pretty good tools to solve most of their problems. If Super Search's UI is not good enough, they can write a script that they run locally, hitting our API, and they can do pretty much anything we can do.

But that still has problems. Local scripts are not ideal: it's inconvenient to share them or to expose their results, it's hard to work on them collaboratively, it requires working on some rendering and querying the API where one could just focus on processing the data, and it doesn't integrate with our Web site. I think we can do better. And to demonstrate that, I built a prototype. Introducing...

Spectateur

spectateur.jpg Spectateur is a service that takes care of querying the API and rendering the data for you. All you need to do is work on the data, make it what you want it to be, and share your custom report with the rest of the world. It uses a language commonly known, JavaScript, so that most people (at least at Mozilla) can understand and hack what you have done. It lets you easily save your report and gives you a URL to bookmark and to share. And that's about it, because it's just a prototype, but it's still pretty cool, isn't it?

To explain it a little more: Spectateur contains three parts. The Model lets you choose what data you want. It uses Super Search and gives you about the same capabilities that Socorro's UI has. Once you have set your filters and chosen the aggregations you need, we move to the Controller. That's a simple JavaScript editor (using Ace) and you can type almost anything in there. Just keep the function transform, the callback and the last lines that set the interface, otherwise it won't work at all. There are also some limitations for security: the code is executed in a Web Worker in an iframe, so you have no access to the main page's scope. Network requests are blocked, among other things. I'm using a wonderful library called jailed, if you want to know more, please read its documentation.

Once you are done writing your controller, and you have exposed your data, you can click the Run button to create the View. It will fetch the data, run your processor on that data and then render the results following the rules you have exposed. The data can currently be displayed as a table (using jsGrid) or as a chart (using Chart.js). For details, please read the documentation of Spectateur (there's a link at the top). When you are satisfied with your custom report, click the button Save. That will save the Model and the Controller and give you a URL (by updating the URL bar). Come back to that URL to reload your report. Note that if you make a change to your report and click Save again, a new URL will be generated, the previous report won't be overwritten.

As an example, here is a report that shows, for our B2G product, a graph of the top versions, a chart of the top signatures and a list of crash reports, all of that based on data from the last 7 days: https://spectateur.mozilla.io/#58a036ec-c5bf-469a-9b23-d0431b67f436

I hope this tool will be useful to our users. As usual, if you have comments, feedback, criticisms, if you feel this is a waste of time and we should not invest any more time in it, or on the contrary you think this is what you needed this whole time, please please please let us know!

Tuesday, 2 December 2014

Socorro: the Super Search Fields guide

Socorro has a master list of fields, called the Super Search Fields, that controls several parts of the application: Super Search and its derivatives (Signature report, Your crash reports... ), available columns in report/list/, and exposed fields in the public API. Fields contained in that list are known to the application, and have a set of attributes that define the behavior of the app regarding each of those fields. An explanation of those attributes can be found in our documentation.

In this guide, I will show you how to use the administration tool we built to manage that list.

You need to be a superuser to be able to use this administration tool.

Understanding the effects of this list

It is important to fully understand the effects of adding, removing or editing a field in this Super Search Fields tool.

A field needs to have a unique Name, and a unique combination of Namespace and Name in database. Those are the only mandatory values for a field. Thus, if a field does not define any other attribute and keeps their default values, it won't have any impact in the application -- it will merely be "known", that's all.

Now, here are the important attributes and their effects:

  • Is exposed - if this value is checked, the field will be accessible in Super Search as a filter.
  • Is returned - if this value is checked, the field will be accessible in Super Search as a facet / aggregation. It will also be available as a column in Super Search and report/list/, and it will be returned in the public API.
  • Permissions needed - permissions listed in this attribute will be required for a user to be able to use or see this field.
  • Storage mapping - this value will be used when creating the mapping to use in Elasticsearch. It changes the way the field is stored. You can use this value to define some special rules for a field, for example if it needs a specific analyzer. This is a sensitive attribute, if you don't know what to do with it, leave it empty and Elasticsearch will guess what the best mapping is for that field.

It is, as always, a rule of thumb to apply changes to the dev/staging environments before doing so in production. And to my Mozilla colleagues: this is mandatory! Please always apply any change to stage first, verify it works as you want (using Super Search for example), then apply it to production and verify there.

Getting there

To get to the Super Search Fields admin tool, you first need to be logged in as a superuser. Once that's done, you will see a link to the administration in the bottom-right corner of the page.

Fig 1 - Admin link

Clicking that link will get you to the admin home page, where you will find a link to the Super Search Fields page.

Fig 2 - Admin home page

The Super Search Fields page lists all the currently known fields with their attributes.

Fig 3 - Super Search Fields page

Adding a new field

On the Super Search Fields page, click the Create a new field link in the top-right corner. That leads you to a form.

Fig 4 - New field button

Fill all the inputs with the values you need. Note that Name is a unique identifier to this field, but also the name that will be displayed in Super Search. It doesn't have to be the same as Name in database. The current convention is to use the database name but in lower case and with underscores. So for example if your field is named DOMIPCEnabled in the database, we would make the Name something like dom_ipc_enabled.

Use the documentation about our attributes to understand how to fill that form.

Fig 5 - Example data for the new field form

Clicking the Create button might take some time, especially if you filled the Storage mapping attribute. If you did, in the back-end the application will perform a few things to very that this change does not break Elasticsearch indexing. If you get redirected to the Super Search Fields page, that means the operation was successful. Otherwise, an error will be displayed and you will need to press the Back button of your browser and fix the form data.

Note that cache is refreshed whenever you make a change to the list, so you can verify your changes right away by looking at the list.

Editing a field

Find the field you want to edit in the Super Search Fields list, and click the edit icon to the right of that field's row. That will lead you to a form much like the New field one, but prefilled with the current attributes' values of that field. Make the appropriate changes you need, and press the Update button. What applies to the New field form does apply here as well (mapping checks, cache refreshing, etc. ).

Fig 6 - The edit icon

Deleting a field

Find the field you want to edit in the Super Search Fields list, and click the delete icon to the right of that field's row. You will be prompted to confirm your intention. If you are sure about what you're doing, then confirm and you will be done.

Fig 7 - The delete icon

The missing fields tool

We have a tool that looks at all the fields known by Elasticsearch (meaning that Elasticsearch has received at least one document containing that field) and all the fields known in the Super Search Fields, and shows a diff of those. It is a good way to see if you did not forget some key fields that you could use in the app.

To access that list, click the See the list of missing fields link just above the Super Search Fields list.

Fig 8 - The missing fields link

The list of missing fields provides a direct link to create the field for each row. It will take you to the New field form with some prefilled values.

Fig 9 - Missing fields page

Conclusion

I think I have covered it all. If not, let me know and I'll adjust this guide. Same goes if you think some things are unclear or poorly explained.

If you find bugs in this Super Search Fields tool, please use Bugzilla to report them. And remember, Socorro is free / "libre" software, so you can also go ahead and fix the bugs yourself! :-)

Tuesday, 1 July 2014

About Socorro

I have been working on Socorro for about 3 years now. It was about time that I started talking about it on this blog! With this first post, I am inaugurating 2 new tags, socorro and mozilla. I'm also going to add a language tag to my posts, so each new post will either be en or fr. This way, you can choose more accurately what kind of content you want to read from this blog. The previous links on the tags lead to the filtered Atom feeds, and I will add the mozilla feed to the Planet.

So, what is Socorro?

  1. a planet in the Star Wars universe
  2. a city in Brazil
  3. an island in Mexico
  4. a server to process breakpad crash reports

All of those are true, but as you guessed (I hope! ) we will only consider the last option. So basically, Socorro is a software that collects, processes and stores crash reports generated by breakpad. We use breakpad in our desktop software such as Firefox, Firefox for Android, Thunderbird and FirefoxOS. We then have a Web interface that shows information about the stability our those software, and we call it crash-stats. As usual with Mozilla, most of the information on this site is public, the only exceptions being sensitive data like email addresses, URLs, etc.

Most of Socorro's code base in written in Python. The back-end is raw Python based on a tool called configman, developed by my colleagues Lars and Peter, that handles the configuration and is used as a dependency injection system. It allows Socorro to be very modular, and to add or remove components easily without changing any code. We use 3 different databases:

  1. HBase is the primary storage and is there for data consistency (it contains all the data we receive and process and must never lose data)
  2. PostgreSQL is used as the main source of data for the UI (it contains all the processed data and has a lot of materialized views with various computed information)
  3. elasticsearch is used as the secondary source of data for the UI, mainly for everything related to search. This is what I have been mainly working on.

The Web app is based on Playdoh, Mozilla's customized django framework. We have more and more javascript, mostly using jQuery, but it's a bit chaotic at the moment.

Interested in contributing?

If Socorro sounds like a good project to you and you would like to participate, we would be very happy to hear about you! There are a few places where you can start:

You can also join us in Mozilla's IRC server, channel #breakpad. There are a good variety of bugs to fix on Socorro, from pure front-end work to complex back-end Python problems. And if Socorro is not your thing but you still want to help Mozilla fulfill its mission, you can take a look at whatcanidoformozilla.org!