Blog

Helping you keep your server online and your website fast.

Contest - Win a RØDE smartLav! [Contest over]

[Update: The contest is closed; congratulations to winner Junior King!]

Server Check.in was built to help people know when their servers and websites are down (more about us). Our primary audience is small businesses, development shops, and individual website owners. Many of our customers are podcasters or publish videos online, and are always looking for new tools to help them get better recordings more easily!

Since I do a lot of recording and podcasting myself, I've written about many different microphones and recording solutions for iPhones and other smartphones and tablets, and I heartily recommend the new (and hard-to-get!) RØDE smartLav Microphone.

Rode smartLav product picture by Jeff Geerling

The smartLav is a little clip-on lavaliere mic that connects directly to your iPhone or Android phone, and lets you record lectures, podcasts, etc. with greater ease and clarity than most other solutions. It costs about $60, and is available from many different retailers.

We happen to have a brand new smartLav we'd like to give away to one lucky winner!

How to Enter

  1. Leave ONE comment below. Tell us why you'd like the smartLav.
    Use your real name and email address so we can contact you if you win!
  2. Tweet or Like this page (not a requirement, but we'd appreciate it!).
  3. Check the Server Check.in Blog on Saturday, June 1 (that's this Saturday!) in the afternoon to see who won.

Read the official contest rules at the bottom of this post.

More Info about the smartLav

And don't forget to Sign Up for Server Check.in to keep tabs on your servers and websites; it's only $15/year for peace of mind. Even if you don't sign up, read more about us, and follow us on Twitter (@servercheckin) or like us on Facebook.

Rules: No purchase necessary. We'll choose one of the comments on this post at random and contact its author on Saturday morning. Please make sure you check your email and/or Twitter on Saturday! If that person responds to the email or Twitter message, they will become the winner. The winner will be announced on this blog and on Server Check.in's social media accounts on Saturday, June 1, in the afternoon. We will ship the smartLav, free of charge, to the winner.

Approximate value of the prize is $60 (plus shipping). Winner will be notified by email. Odds of winning depend on the number of comments on this post.

We will not use any email addresses for any purpose besides contacting the winner. Check out our privacy policy for more information about what we do with paid customers' information. We respect your privacy—especially if you sign up for our service ;-)

Moving functionality to Node.js increased per-server capacity by 100x

Node.js vs PHP and Drupal Queue Comparison - asynchronous vs synchronous

When Server Check.in was introduced, it was only slightly beyond the MVP stage. Since then, we've been iterating on it, making things more stable and adding small but effective features.

One feature that we just finished deploying is a small Node.js application that runs in tandem with Drupal to allow for an incredibly large number of servers and websites to be checked in a fraction of the time that we were checking them using only PHP, cron, and Drupal's Queue API.

The Old Way - Drupal's Queue API

The original pipeline for server checks was to do the following:

  1. On a cron run, find all servers that are due to be checked. Load them into a DrupalQueue (which we were storing in the queue table in Server Check.in's MySQL database).
  2. Build a queue worker callback, and use hook_cron_queue_info() to let Drupal manage the queue, and pass items to the worker callback whenever cron is running.
  3. The worker callback received each server in the queue, one by one, checked it (using cURL for http requests, and Ping for server pings), then posted the results back to the database.
  4. If a server went down, we would send it to another confirmation queue for a second check. If a server went back up, we would send out the proper notifications and update the server's status.
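
As a rough illustration of step 4, the decision logic might look something like the sketch below. It is written in JavaScript rather than the PHP the original worker used. The function name, the action labels, and the `isConfirmation` flag are all hypothetical, and sending a "down" notification once the confirmation check also fails is an assumption on my part.

```javascript
// Hypothetical sketch of the decision in step 4. The action names and the
// `isConfirmation` flag are assumptions, not Server Check.in's actual code.
function nextAction(previousStatus, checkPassed, isConfirmation = false) {
  if (checkPassed) {
    // A server that was marked down and now responds has recovered.
    return previousStatus === "down" ? "notify-up" : "none";
  }
  // The server is already known to be down, so there is nothing new to report.
  if (previousStatus === "down") return "none";
  // A first failure goes to the confirmation queue; a failed second check
  // confirms the outage.
  return isConfirmation ? "notify-down" : "queue-confirmation";
}

console.log(nextAction("up", false));       // queue-confirmation
console.log(nextAction("up", false, true)); // notify-down
console.log(nextAction("down", true));      // notify-up
```

The second check exists to avoid alerting on a single dropped request.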

My own tests showed that this pipeline would scale to a few hundred or a thousand servers, but would quickly become a bottleneck. Therefore, we limited the number of users who were allowed to sign up for Server Check.in. We didn't want users to sign up and get a nasty surprise of their servers only getting checked every 20 or 30 minutes! (We check servers every 5-10 minutes, depending on your plan.)

The major flaw with this method is that it is a synchronous, serial process: each server check has to wait for the previous check to complete. This means that, when checking a hundred or so servers, the queue could take 10-15 minutes to empty! (This assumes the worst case, where all servers are down or unresponsive and each check hits the 10-second timeout.)
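
The worst-case arithmetic is easy to sketch. The snippet below is only an illustration: the function and parameter names are made up, and the `concurrency` parameter is an assumption added to show what changes when checks no longer wait on each other.

```javascript
// Worst-case time to drain a queue when every check hits the timeout.
// The figures mirror the ones above (100 servers, 10-second timeout);
// the names and the `concurrency` parameter are illustrative assumptions.
function worstCaseSeconds(serverCount, timeoutSeconds, concurrency = 1) {
  // Checks run in batches of `concurrency`; each batch takes one full timeout.
  const batches = Math.ceil(serverCount / concurrency);
  return batches * timeoutSeconds;
}

const serial = worstCaseSeconds(100, 10);          // 1000 seconds
const concurrent = worstCaseSeconds(100, 10, 100); // 10 seconds
console.log(`serial: ${(serial / 60).toFixed(1)} min, concurrent: ${concurrent} s`);
// prints "serial: 16.7 min, concurrent: 10 s"
```

With exactly 100 servers the serial worst case comes out near 17 minutes, the same ballpark as the estimate above, while running all checks at once is bounded by a single timeout.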

Additionally, scaling beyond one web server for the checks would be difficult at best. The built-in Drupal Queue API can do some great things, but it's not built to run through thousands of long-running tasks in very short amounts of time. (Note that, for some processes, like those that require a lot of processing, we have used the cron queue along with drush concurrency to process things in parallel, but this is overkill for simple (and non-CPU-intense) HTTP requests, and gobbles up server memory.)

As a final note: we had looked into using curl_multi_exec() with PHP to get some amount of concurrency in our server checks, but this would've still required a bunch of additional work refactoring the Queue API and cron code. That was probably more work than building something in Node, and still not as scalable!

Building a Node.js server checking application

We decided to take a look at Node.js because it's a great platform for running tons of asynchronous (parallel) tasks that aren't CPU-bound. We built a simple Node.js app that gets a list of servers to check, then runs through them in parallel (checking for a 200 OK status, checking for content on a page, or pinging a URL/IP address), and sends the results back to our main server.
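
Here is a minimal sketch of the "run through them in parallel" idea, not our production code. The check function is injected so the example runs without touching the network; in a real app it would make the HTTP request or ping. `runChecks`, `fakeCheck`, and the shape of the server objects are all hypothetical.

```javascript
// Start every check at once and wait for all of them to finish.
// `checkServer` is injected so this sketch runs without network access.
async function runChecks(servers, checkServer) {
  const started = Date.now();
  const results = await Promise.all(
    servers.map(async (server) => {
      try {
        const ok = await checkServer(server);
        return { id: server.id, status: ok ? "up" : "down" };
      } catch {
        // A rejected check (timeout, DNS failure, ...) counts as down.
        return { id: server.id, status: "down" };
      }
    })
  );
  return { results, elapsedMs: Date.now() - started };
}

// Stand-in check: resolves after 50 ms; ids ending in "bad" report failure.
const fakeCheck = (server) =>
  new Promise((resolve) => setTimeout(() => resolve(!server.id.endsWith("bad")), 50));

const servers = [{ id: "a" }, { id: "b-bad" }, { id: "c" }];
runChecks(servers, fakeCheck).then(({ results, elapsedMs }) => {
  console.log(results, `${elapsedMs} ms`);
});
```

Because the three simulated checks overlap, the total elapsed time should be close to one 50 ms delay rather than three of them.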

We ended up using the request library instead of Node's built-in http module, simply because it makes HTTP requests much easier to manage, especially when redirects or other craziness are involved. We also used a modified version of the ping library for server pings, and have released the improved version as jjg-ping on NPM.

For the connection between Node.js and the Server Check.in Drupal site/database, we built a few simple connectors:

  • Custom JSON Callback in Drupal: To get a list of servers to check, we simply call a custom page callback that grabs the servers, marks them as 'checked out' so other servers won't check servers while they're in process, and prints a structured JSON array of server objects.
  • Send results back via CSV: In the Node app, we simply write results into a CSV file using fs.appendFile (from Node's built-in fs module), then post that CSV file back to the main server.
  • Processing the results in Drupal: We wrote a simple drush script that bootstraps Drupal just enough to access the database/API. It grabs all the result files, processes all the results (sending notifications or adding an item to the confirmation queue as needed), and clears out the result files.
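
The CSV step in the list above can be sketched roughly as follows. The column order, the file name, and the helper names are assumptions for illustration; only `fs.appendFile` itself comes from the description above. Posting the file back to the main server is not shown.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Turn one check result into a CSV row. The column order (server id, status,
// latency in ms, Unix timestamp) is an assumption. Fields are joined as-is,
// so this only works for values that contain no commas or quotes.
function toCsvRow(result) {
  return [result.id, result.status, result.latencyMs, result.checkedAt].join(",") + "\n";
}

// Append a row to the results file; fs.appendFile creates the file if needed.
function recordResult(file, result, done) {
  fs.appendFile(file, toCsvRow(result), done);
}

// Hypothetical usage: write one result to a temp file, then print the file.
const file = path.join(os.tmpdir(), `results-${process.pid}.csv`);
const sample = { id: 42, status: "up", latencyMs: 120, checkedAt: 1370000000 };
recordResult(file, sample, (err) => {
  if (err) throw err;
  console.log(fs.readFileSync(file, "utf8"));
});
```

Appending one line per result keeps the Node side simple: each check writes its own row as it finishes, and the Drupal side can process the file in bulk later.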

In the future, we're planning on using a central database, maybe something using Amazon RDS or another cloud-based DB, and using that to store and retrieve raw server check data. But for now, passing CSV files back to the main server, and clearing them out once processed, seems to work fine. Using CSV files in such a manner requires a little bit of extra work, but it was quick to get going—I'll iterate later :)

This move to using Node.js for asynchronous requests and pings resulted in three great benefits:

  1. We can easily scale out horizontally, with more servers to do the server checking (with more regional diversity, and with more servers as needed). Just spin up a new Node.js server, and let it do its thing.
  2. We can now process ~100 server checks per second, per Node server, whereas with PHP/Drupal alone, we were only able to do about 1 per second, on average.
  3. Node.js used about 50% less RAM than the Drupal stack that had to run for each cron queue processing run. The maximum amount of memory that was used by Node was about 20 MB. Each httpd thread on the server is around 45-50 MB.
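
To put the second point in perspective, a quick back-of-the-envelope calculation shows how many checks fit into a single 5-minute interval at each rate. This is only a sketch: the function name is made up, and it assumes the approximate rates quoted above are sustained for the whole interval.

```javascript
// How many checks fit in one check interval at a given sustained rate.
function checksPerInterval(checksPerSecond, intervalMinutes) {
  return checksPerSecond * intervalMinutes * 60;
}

console.log(checksPerInterval(1, 5));   // 300   (PHP/Drupal alone)
console.log(checksPerInterval(100, 5)); // 30000 (one Node server)
```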

Node for Asynchronous network or IO-bound tasks

My takeaway: If you need to do some potentially slow tasks very often, and they're either network or IO-bound, consider moving those tasks off to a Node.js app. Your server and your overloaded queue will thank you!

New Server Check.in Features - April 2013

We worked hard this month to add some great new features to Server Check.in. In addition to some tweaks to the interface to make your experience adding and managing servers much more enjoyable, we've added a new subscription plan!

  • New Premium Plan available (25 servers, 5 minute checks)
    You can now upgrade to our Premium plan and get 25 servers (up from 5) and 5-minute check intervals (down from 10), as well as more allowed SMS messages/month. The price for the Premium plan is just $48/year (that's just $4/month!). If you've run out of servers in your account, you can upgrade for a pro-rated amount by logging into Server Check.in and clicking the Upgrade link under the 'Edit Account' tab.
  • New 'Global Website Latency' Tool
    We're slowly adding more tools to help you gauge your site's performance over time and against other websites. Our new Global Latency Graphs tool lets you see how the entire Internet (as seen by Server Check.in) is performing.
  • The Server Check.in Blog!
    Because we care about performance, we thought we'd start writing about how you can make your own sites and servers perform better, and stay up longer. Check out our blog and subscribe to our Blog's RSS feed to keep up to date.

We have even more exciting new features we're hoping to reveal in the coming summer months, so keep an eye on your inbox for more! As always, please let us know if you have any ideas or suggestions for improving Server Check.in by emailing us or replying to this note. Many of the features we've implemented in our short existence have been directly requested by you—our awesome customers!

New Server Check.in Features - March 2013

Here are the latest features we've added to Server Check.in since January:

  • Monthly Server Summary Emails
    One of the most requested features was a monthly summary email showing how your servers are doing at the end of a month. Server Check.in will now send you a summary email on the first day of every month, showing the previous month's stats for all the servers in your account.

    This feature is enabled by default, but you can opt out of these emails by unchecking the 'Monthly Summary Emails' checkbox in your account on Server Check.in.

  • HTTP/HTTPS Content Checks
    This is a feature we wanted since day one, and many of you have requested it; you can now have Server Check.in check for the presence of a bit of text in the markup of a web page.

    For our launch, we focused on simplicity, and made sure that everything ran well with just two kinds of checks: HTTP '200 OK' status checks, and server pings. Now that we've been able to spend more time ensuring we can handle the extra load of content checks, we've enabled this service for all your servers.

    One advanced way you can use this feature is to create a page somewhere on your site that prints a certain message if a service you have on your site is running correctly, then check for that text in Server Check.in.
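
Conceptually, a content check boils down to something like the sketch below. The function name and the health-page markup are hypothetical, and requiring a 200 status in addition to the text match is an assumption rather than a description of how our checks are implemented.

```javascript
// A content check passes only if the page loads and contains the expected text.
// Requiring a 200 status as well is an assumption; all names are hypothetical.
function contentCheckPasses(statusCode, body, expectedText) {
  return statusCode === 200 && body.includes(expectedText);
}

// Example: a hypothetical health page that prints a marker when a service is up.
const healthPage = "<html><body>Database: OK</body></html>";
console.log(contentCheckPasses(200, healthPage, "Database: OK")); // true
console.log(contentCheckPasses(200, healthPage, "Search: OK"));   // false
```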

We're hard at work making even more improvements to Server Check.in. What would you like to see us do? Let us know by contacting us or posting a comment below.

New Server Check.in Features - January 2013

We've been busy in the new year! Here are some new features we've rolled out in the past month:

  • Sparkline graphs of the past day's latency on your dashboard.
    Now you can simply visit https://servercheck.in/ and (when you're logged in) get a simple, quick overview of all your servers' and sites' health! There's a neat little sparkline graph that shows the server's latency over the past day.
  • Uptime summary for each server.
    If you visit one of your server's pages, you can now see uptime percentages, a count of outages, and the amount of downtime for the server in the past day and month. It's a nice way to see 'how many nines' you're getting!
  • 10-minute checks (used to be 15 minutes).
    After monitoring our servers for the first few weeks of usage, we determined that we can offer checks every ten minutes for all our customers. We continue to monitor our resource usage to make sure that we can offer you the best value for your money, while making sure our service can continue to improve as we get more users!
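
For anyone curious how an uptime percentage like the one in the second item relates to downtime, here is a small sketch. The function name and the 30-day month are assumptions for illustration, not the exact calculation the dashboard performs.

```javascript
// Uptime as a percentage of a period, given total downtime in that period.
function uptimePercent(downtimeMinutes, periodMinutes) {
  return ((periodMinutes - downtimeMinutes) / periodMinutes) * 100;
}

// A 30-day month has 43,200 minutes; 43.2 minutes of downtime is "three nines".
const minutesInMonth = 30 * 24 * 60;
console.log(uptimePercent(43.2, minutesInMonth).toFixed(2)); // 99.90
```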

We'll try to keep you updated through this blog with more new features and improvements. Check back soon, and let us know in the comments what new features you'd like to see!

Server Check.in case study posted to Drupal.org

We posted a detailed case study about how Server Check.in was built on Drupal.org: Server Check.in case study.

From the case study:

Why Drupal was chosen:
Drupal provides an extremely robust and flexible platform for building websites. Server Check.in was built to be simple, inexpensive, and easy to use. More than half of the site's features were simply pieced together using modules already built by others. Missing parts—most of the backend of the service, along with payment and notification integration—were easy to add through a few custom modules.

Describe the project (goals, requirements and outcome):
Server Check.in was built using Drupal 7, and the responsive theme was built using a Zen sub-theme. Drush powers most of the backend functionality of the service (server checks, queue consumption, notification handling). Some parts of the service are provided through third party services: Stripe powers payments, and Twilio powers SMS notifications.

Read the entire case study. (Earlier, I also posted a thread with more details about Server Check.in on Hacker News).
