Build a web analytics tag scanning and monitoring tool.
The application lets site owners scan their websites and gather the following information:
- Number of pages
- Presence of analytics tags (Google Analytics, Yahoo, and others)
- Number of broken links
- Detailed list of broken links and their locations on the site
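As a rough illustration of the report described above, the sketch below summarizes already-crawled pages into the listed metrics. It is not the project's code: the tag signatures, the `detect_tags` and `summarize_scan` names, and the input shape (a dict of page URL to HTML and link statuses) are all assumptions made for the example.

```python
import re

# Illustrative signatures only (assumption); the real tool's detection
# rules are not described in the text.
TAG_SIGNATURES = {
    "Google Analytics": re.compile(
        r"google-analytics\.com/(ga|analytics)\.js|googletagmanager\.com/gtag/js"
    ),
    "Yahoo Web Analytics": re.compile(r"d\.yimg\.com/mi/ywa\.js"),
}


def detect_tags(html):
    """Return the sorted names of analytics tags whose signature appears in the HTML."""
    return sorted(
        name for name, pattern in TAG_SIGNATURES.items() if pattern.search(html)
    )


def summarize_scan(pages):
    """Build a scan report from crawled pages.

    pages: dict mapping page URL -> {"html": str, "links": {target_url: http_status}}
    (hypothetical input shape for this sketch).
    """
    tag_pages = {name: [] for name in TAG_SIGNATURES}
    broken = []
    for url, page in pages.items():
        for name in detect_tags(page["html"]):
            tag_pages[name].append(url)
        for target, status in page["links"].items():
            # Treat any 4xx/5xx response as a broken link.
            if status >= 400:
                broken.append({"page": url, "link": target, "status": status})
    return {
        "page_count": len(pages),
        "tags": tag_pages,
        "broken_link_count": len(broken),
        "broken_links": broken,
    }
```

Each broken link is stored together with the page it was found on, which is what makes the "location on the site" part of the report possible.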
The client side is built on the Ruby on Rails framework, while the processing part uses Java-based frameworks (Solr and Nutch). The two sides communicate through an API.
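The text does not describe the API itself, so the following is only a sketch of what a JSON exchange between the two sides could look like. The field names (`site_url`, `max_pages`, `state`, `pages_scanned`) and both function names are hypothetical.

```python
import json


def build_scan_request(site_url, max_pages=1000):
    """Serialize a scan request the web side could send to the processing side.

    The payload fields are assumptions for illustration, not the real API.
    """
    if not site_url.startswith(("http://", "https://")):
        raise ValueError("site_url must be an absolute http(s) URL")
    return json.dumps({"site_url": site_url, "max_pages": max_pages})


def parse_scan_status(body):
    """Decode a status response into a (state, pages_scanned) tuple."""
    data = json.loads(body)
    return data["state"], int(data.get("pages_scanned", 0))
```

Keeping the contract this small means either side can change its internals without affecting the other, as long as the JSON shape stays the same.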
The architecture was designed to scale, so the application can process sites with hundreds of thousands of pages or more. Our team developed the application from scratch, both the RoR and Java sides. This application is a good example of how two technologies can be combined to achieve the best results in a short time.
The most interesting parts of the application from an engineering point of view are:
- API-based communication between the RoR and Java sides
- Subscription-based functionality with PayPal payments
- Configurable, scalable, and stable processing engine based on Nutch and Solr
- Scaling the application to process a large number of sites simultaneously
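To make the last point concrete, here is a minimal sketch of scanning several sites at once with a bounded worker pool. This is a swapped-in, single-process technique chosen for illustration; it is not how the Nutch-based engine is described as working, and `scan_site` is a placeholder rather than a real crawler.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_site(site_url):
    """Placeholder for a real crawl (assumption); returns a stub result."""
    return {"site": site_url, "status": "done"}


def scan_sites(site_urls, max_workers=4):
    """Scan several sites concurrently, returning results in input order."""
    # max_workers caps how many sites are processed at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scan_site, site_urls))
```

Capping the number of workers keeps a burst of scan requests from exhausting the machine; additional sites simply wait for a free worker.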