Recent feed issues, upcoming scheduled maintenance/downtime, and how you can help

Discussion in 'Site Announcements' started by sothis, Jul 13, 2012.

  sothis

    sothis

    Posted by sothis on Jul 13, 2012
    Today, there was an issue where the feed showed an error on the profiles. Just wanted to give you a heads up about what happened, and tell you about some upcoming downtime tomorrow to maintain the servers.

    Basically, the hard drive the feed's database resides on had some issues with bad data (as it's a year-old drive, is heavily used, and has started to fail), and the drive needs to be replaced. It took a while, but we were able to fix the bad sectors and stabilize the drive. Even so, it still needs to be replaced. This issue ONLY AFFECTED THE SITE FEED, which is stored on a separate type of database on a separate server. Your list data, all site content (reviews, screenshots, everything) are on the main server, and are not affected by this.

    Meanwhile, there are some things we need to do on the main server as well - one of the drives of the RAID array needs to be replaced, and a few other things tinkered with to keep them going. With such a heavily trafficked server, that comes with the territory.

    We've decided we'll do these all at once, since each thing would require the server to be down. We're planning it for tomorrow at some point, and I'll post updates warning people ahead of time. Updating the feed server may take extra time to get up and running after the hard drive switch, but we'll update the site code to not output errors as a temporary measure, once the server is back up, if needed.

    HUGE thanks to holms and dmitrijus who assisted for hours today with troubleshooting what could be wrong with the server. They deserve a lot of kudos.

    Finally, while I really, really dislike bringing up donations, several people in irc today asked about it, so I'll mention it. If you feel like supporting the site, here's the link to donate

    Right now times are pretty tough - the site keeps growing, traffic-wise, which means more resources (bigger and better servers, more support, etc) are needed, but the revenue coming in from ads is not growing, and it's low to begin with. The site is only pulling in a little more money than the servers and other related bills cost (I don't pay myself a dime, never have, all money goes straight into the site).

    Meanwhile, because of this crash, I just had to pay 10gen $2,500 USD for the next year, for support on the activity feed database - it's NoSQL, which is still a newish technology and there aren't really any reputable companies out there (or highly skilled server admins with that specialty). There wasn't really another choice for the feed, as all site activity is logged and to a huge degree (it's a 20 gig size for the last 8 months). The main developer I had for years is gone now, and I'm not an expert on the subject. We need ongoing support to update safely, optimize so those cursortimeout errors stop, etc. So $200/month sucks, but is pretty necessary to keep the site going.

    And because of that extra $200/month, we're now basically breaking even. That's really scary, given there's no money to do ANYTHING - nothing for emergencies that pop up, I can't hire anyone to help with some of the really complicated aspects of the site, and I can't move forward with some of the bigger projects I've wanted to do (like getting a professional redesign for certain pages, etc). Unlike some of our competitors, we aren't corporate owned and don't have huge sponsors that can pay for multiple salaries - it's just me, and I work for free.

    So, again, if you love Anime-Planet and want to see it able to continue, please consider donating

    **note: I realize I still haven't inputted badges for a few months for donations - I apologize, it's due to a lack of time. ;_; With no developers now except me, I've found it increasingly difficult to get everything done, by myself, that's needed to support a site of 400,000 people a month. Donations help, among other things, to ensure I can actually get more people onboard to help out.

    Thanks everyone for your patience during this time, and I'll update more once it gets closer to the scheduled downtime. And thanks for using A-P :drinking:
  r18

    r18

    Posted by r18 on Jul 14, 2012
    well i can't speak for others ...but i very much appreciate the time and effort that you put into this site....and i don't feel any pressing need to have a badge dealt with when there are actual site related emergencies to be dealt with....take your time and don't worry, i'm not going anywhere....
  Kari5

    Kari5

    Posted by Kari5 on Jul 14, 2012
    Thanks for your hard work as always, Sothis!
  NarkyOtic

    NarkyOtic

    Posted by NarkyOtic on Jul 14, 2012
    I have a buttload of awestruck respect for your efforts on this site, sothis. I don't pretend to know anything about its inner workings, but a blind man in space could see that A-P is a GARGANTUAN undertaking all of the time, especially for someone working practically alone.

    I plan on donating a decent dime when my little end of financial year bonus comes through, which will likely be early next week. I'll do more if I can.

    All the best for the shift ahead!
  sothis

    sothis

    Posted by sothis on Jul 14, 2012
    The site is now back up after many long hours and is working perfectly. The delay was due to the RAID array on the main server - not only was a drive not working, but the configuration was wacky and needed to be fixed. This led to needing 2 separate array rebuilds.

    Thanks to everyone for your patience during the maintenance - and if you're glad to see the site back up, please stop by the profiles of holms and dmitrijus and drop them a thanks. They spent all day today, again, helping out with making sure the servers are working during the maintenance, and I couldn't have done it without them.

    And if you want to help AP continue running, as unfortunately now we'll be spending a few more hundred dollars a month on server support, here's the link to donate to the cause - anything is appreciated!

    I'm off to a much, much needed dinner and adult beverage
  Drahken

    Drahken

    Posted by Drahken on Jul 14, 2012
    ^You neglected this, but oh well.

    Decidedly strange that holms goes from some random member that you were ticked off at because of his attitude & his threat to build an app, to someone who helps fix the site. Does this mean he gets his request for an api or whatever it was filled? :p
  sothis

    sothis

    Posted by sothis on Jul 15, 2012
    i posted one update (on facebook/twitter), but due to the need to get it started i started it shortly after i got up in the morning (not really possible to send warnings from bed). better that than it being down another night. i then posted judicious updates on facebook throughout the day, so i suggest people check that out anytime there's a weird issue

    and no, it's not that simple. but yes, holms did help extremely
  Anathemus

    Anathemus

    Posted by Anathemus on Jul 19, 2012
    I might donate another small amount when I can. Have you considered searching for sponsor solutions?
  About7Narwhal

    About7Narwhal

    I threw a ton of money at my screen. Did it help?
  hoffstyle

    hoffstyle

    Posted by hoffstyle on Aug 22, 2012
    Is there going to be a fix one day of how long it takes for personal profile pages to load?
  sothis

    sothis

    Posted by sothis on Aug 22, 2012
    i dont know what that means
  hamletsmage

    hamletsmage

    I think they're asking if there is scheduled maintenance to fix the profile pages, which are loading much slower than any other pages.
  sothis

    sothis

    Posted by sothis on Aug 22, 2012
    i know what it technically means, just saying i havent noticed the profile pages lagging
  hamletsmage

    hamletsmage

    Mine will lag every so often, or only load half the page, usually showing error messages on the friend's feed and then not loading anything below that. I know a few other people have had similar issues, but I can't remember which thread I saw that particular discussion in.

    Edit: I found the small discussion on this subject, from the beginning of August.
    Last edited: Aug 22, 2012
  Drahken

    Drahken

    Posted by Drahken on Aug 22, 2012
    See this discussion in the bugs forum:

    I have noticed an intermittent lagging problem with the user pages in relation to the feeds. The profile page will load, then hang for a bit just before where the feed is, then load the feed after a couple seconds & then load the rest of the page immediately. It's clear that when this happens, there is some kind of slowness with the feeds, which is causing the profile page to load slowly.
    Unfortunately, the inconsistency of the behavior will make it difficult to track down. What we'd probably need to do is try to get everyone to jot down the times when they encounter the issue, then you can look at the logs for the times & see if there was heavy traffic or some server issue or whatever.
  Bamboocha

    Bamboocha

    Posted by Bamboocha on Aug 23, 2012
    For me it's pretty much all the time. My profile page has two loading speeds: slow and slower. Right now it's only slow, and at least it loads without a mongo error. Other pages on AP load fast.

    I wish there were "Turn Feed OFF" option...
  sothis

    sothis

    Posted by sothis on Aug 23, 2012
    the feed was built to be super scalable so lag issues wouldnt happen - it looks like that was wrong. unfortunately there wont be an easy solution or fix until we get a very skilled mongo developer onboard, cause it's not me.
  hoffstyle

    hoffstyle

    Posted by hoffstyle on Aug 24, 2012
    Yeah, I am talking about how if I log in and click on my profile link, it takes 1-2 minutes to load my page. This isn't an intermittent problem; it happens every time.
  Drahken

    Drahken

    Posted by Drahken on Aug 24, 2012
    That's odd. For me it's very intermittent. I wonder if the speed issue is related to the frequency of activity in the given feed? My friend feed doesn't see too much activity, so that might explain why I see less of a slowdown from it.

