heart cooks brain
March 2010
Heart Cooks Brain Weblog
Archive View

Heart Cooks Brain has recently incorporated many changes to our service monitoring capabilities!

Traditionally, our server monitoring has been handled in-house, with duties distributed among the Heart Cooks Brain server cluster through an elaborate system of checks and balances. Our servers continuously monitored one another and could recover from most problems automatically, or alert a technician when automatic recovery was not possible. This system proved largely effective, but the test of time ultimately brought to light an inherent weakness in the implementation. As an example, last month one of our parent data centers experienced an outage that affected two servers within the Heart Cooks Brain cluster. The outage was brief, but in this particular scenario it set off a chain of events that prevented the servers from recovering automatically and synchronizing with the rest of our server cluster. As a result, alerts were not sent to technicians promptly and some clients experienced intermittent service disruptions.

Since this event occurred, we have been investigating new ways to prevent this and any other scenarios with the potential for similar outcomes from happening again. Today we are proud to announce our enhanced monitoring capabilities! Our services are now under the watchful eye of Pingdom Systems. Their services add an extra layer of reliability by building upon our preexisting capabilities and, more importantly, by introducing independent monitoring systems. Pingdom uses a geographically distributed, independent network of probes to provide uptime monitoring and performance analysis of services. Additionally, their probes provide deeper analysis and broader capabilities than were previously available to us. Traditionally, probes operate by checking for connectivity to a service: if connectivity exists, the probe is considered a success. The reality is that while a lack of connectivity is an unmistakable sign of a disruption, it isn’t the only sign. Pingdom’s probes go beyond simply checking for connectivity: they exchange data with key services and analyze the data those services return. This provides a much more accurate analysis of the health and integrity of monitored services.
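
For readers curious about the difference between the two kinds of probes, here is a minimal sketch in Python. It is only an illustration under our own assumptions, not Pingdom's implementation or API: the function names `connectivity_check` and `content_check`, the timeout values, and the idea of matching an expected string in the response are all hypothetical choices made for this example.

```python
import http.client
import socket
import urllib.error
import urllib.request


def connectivity_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Traditional probe (hypothetical helper): succeeds if a TCP
    connection to host:port can be opened, and checks nothing else."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def content_check(url: str, expected_text: str, timeout: float = 5.0) -> bool:
    """Deeper probe (hypothetical helper): requests data from the service
    and succeeds only if the response is HTTP 200 and its body contains
    expected_text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status == 200 and expected_text in body
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, HTTP error statuses,
        # and malformed responses.
        return False
```

A service that accepts connections but returns an error page would pass `connectivity_check` and fail `content_check`, which is exactly the kind of disruption a connectivity-only probe misses. For a web server you might call `content_check` with the URL of a known page and a phrase that page should always contain.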

© 2010 Heart Cooks Brain