SQLAlchemy Deferred Column Loading

We have a small monitoring Flask web app using SQLAlchemy that we use to keep an eye on the status of some jobs in our processing pipeline.

Yesterday we noticed that our DB was getting nailed every time we refreshed the main status screen, which does NOT show the stack trace (which can be VERY large for big jobs). We needed a way to only pull those fields when they were displayed, but at the same time I didn’t want to have a separate model just to use on the main status screen. What to do?

As of SQLAlchemy 0.8, they offer something called Deferred column loading and it fit the bill nicely! Here’s what we had previously that would eager fetch everything:

from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Trace(Base):
	__tablename__ = 'log'

	id = Column(Integer, primary_key=True)
	key = Column(Text)
	created = Column(DateTime)
	failed = Column(Boolean)
	trace = Column(Text)

	def __init__(self, key, created, failed, trace):
		self.key = key
		self.created = created
		self.failed = failed
		self.trace = trace

	def __repr__(self):
		return "<Trace('{0}', '{1}', '{2}', '{3}', '{4}')>".format(self.id, self.key, self.created, self.failed, self.trace)

And here’s the updated code using deferred column loading:

from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import deferred

Base = declarative_base()

class Trace(Base):
	__tablename__ = 'log'

	id = Column(Integer, primary_key=True)
	key = Column(Text)
	created = Column(DateTime)
	failed = Column(Boolean)
	trace = deferred(Column("trace", Text))

	def __init__(self, key, created, failed, trace):
		self.key = key
		self.created = created
		self.failed = failed
		self.trace = trace

	def __repr__(self):
		# trace is left out on purpose: touching self.trace here would trigger the deferred load
		return "<Trace('{0}', '{1}', '{2}', '{3}')>".format(self.id, self.key, self.created, self.failed)

Now, the trace column is not loaded until it is used, which is exactly what we were looking for. Nice and clean, too!
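To see the deferral in action, here is a minimal sketch against a throwaway in-memory SQLite database. The engine URL, the sample row, and the use of inspect(...).unloaded to peek at what has been fetched are my assumptions for illustration; none of it comes from our monitoring app.

```python
from datetime import datetime

from sqlalchemy import Boolean, Column, DateTime, Integer, Text, create_engine, inspect
from sqlalchemy.orm import Session, deferred, undefer

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Trace(Base):
    __tablename__ = 'log'

    id = Column(Integer, primary_key=True)
    key = Column(Text)
    created = Column(DateTime)
    failed = Column(Boolean)
    trace = deferred(Column("trace", Text))


engine = create_engine('sqlite://')  # in-memory database, illustration only
Base.metadata.create_all(engine)

# Store one sample row with a large-ish trace.
session = Session(engine)
session.add(Trace(key='job-1', created=datetime(2012, 1, 1), failed=True, trace='x' * 1000))
session.commit()
session.close()

# Status screen: a plain query leaves the trace column out of the SELECT.
session = Session(engine)
row = session.query(Trace).first()
unloaded_before = 'trace' in inspect(row).unloaded

# First attribute access issues a second SELECT for just that column.
trace_length = len(row.trace)
unloaded_after = 'trace' in inspect(row).unloaded

# Detail screen: ask for the column up front with undefer().
session.expunge_all()
detail = session.query(Trace).options(undefer(Trace.trace)).first()
loaded_eagerly = 'trace' not in inspect(detail).unloaded
session.close()

print(unloaded_before, trace_length, unloaded_after, loaded_eagerly)
```

The undefer() option is handy for a detail view where you know the trace will be shown, since it avoids the extra round trip.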

MonsterMash: Flask, ZeroMQ, and EchoNest remix

Off and on for the past couple of months I’ve been working on a side project using flask, zeromq, and the remix api by echo nest.

If you take a look online, there are a lot of excellent guides to introduce you to flask, but not many that dive into something more complex or closer to something that an engineer in more distributed services would need to put together. I’ve seen some great guides on organizing larger applications, but not so much commentary about how the experience was. This is what I want to offer to you here, just some thoughts about working with this framework for a client/server project.

MonsterMash

I am a musician, so music is a big part of my life. I’ve always been fascinated with echo nest and all the cool stuff they do with big music data, so I thought I would give some of their tools a spin. What better way to do that than to create a little web app to mash songs up?
All the code is hosted on github here: https://github.com/karlgrz/monstermash

My little site, MonsterMash, uses an example from the echo nest remix api, afromb.py written by Ben Lacker. This set of code basically takes song A and applies segments of song B that are somewhat beat-matched to A. It does some volume enveloping calculations to even out the levels between both tracks, too, so one isn’t extremely overpowering.

If you’re not familiar with the remix api from echo nest, I encourage you to check out their tools here.

Dreamhost

I have a shared hosting account with Dreamhost that hosts this blog (and a variety of other nonsense, as well). I wanted to get flask up and running on Dreamhost since, well, I’m cheap and didn’t want to spend any more money than I had to.

This proved to be a fun challenge that wasn’t very difficult to overcome. Dreamhost doesn’t really give you root access to anything, since it’s shared hosting. This is understandable, but it makes installing python and linux packages kind of annoying. Also, you’re normally forced to use whatever version of python is installed on the server that your account is hosted on. Weak.

Fortunately, virtualenv allows me to easily manage a self contained python environment in my local user directory. I won’t go into all the details here, use the resources out there for virtualenv if you have trouble. I found this guide to be helpful in getting started. Basically, you need to configure the subdomain that you’ll be using for the site to use passenger for python apps and to point to the correct virtualenv path. If you take a look at my passenger_wsgi.py file you’ll get the idea. This has to live at the root of the subdomain to work properly.
Passenger Checkbox in Dreamhost

Dreamhost seems to be doing more than an adequate job of handling requests, but I don’t think anything I’ve done on there has been very high traffic. I’m sure if anything high traffic was hosted there it would probably need to be moved to a self hosted solution somewhere, but that is just speculation.

EC2

I chose Amazon EC2 free tier for the server since it’s, well, free. The t1.micro instance I’m using is not recommended for anything other than this: a demonstration of technology. The instances are NOT tuned for anything production ready that requires any kind of processor power. That being said, for the purposes I was going for this worked out nicely. Free is good!

Basically, the song processing takes (a lot) longer than it would if I actually spent some money for a decent spec’d instance with some computing power for crunching the remix (c1.medium would be sufficient, I think). I’m ok with that. You should be too ;-)

Flask

I’m not going to chronicle the many steps and setups I went through to get flask up and running. I used the following guides along my journey, I think they have done a great job explaining the basics:

Overall, I really enjoyed working with flask. To me, it just feels like how I expect a web framework to work. You get all the routing power you would expect from a web framework with VERY little ceremony. It stays out of your way so you can just write your code and get on with your day. There were very few points in time when I was actually setting up routes or other flask-specific tasks that I had to look for a great deal of information to get moving forward.

For example, a basic “about” view that returns just a static html template can look like this:

@app.route('/about')
def about():
	return render_template('about.html')

Take a look at my views.py for the brunt of the flask code.

I am enamored with SQLAlchemy, the python ORM. It’s quite literally the best ORM I’ve ever used (for better or worse…that’s up for debate as well). I love that queries are as simple as this:

user = db_session.query(User).filter(User.username==username).first()

I have not had a chance to load test this app very well yet, so I don’t have any data related to how efficient SQLAlchemy is being with its query generation to the server.

I also really like working with Jinja2 templates. They feel very natural to me, and there weren’t any situations that I was left wondering how to implement something. Out of the box, they just work.

Flask also feels a lot less taxing to configure and set up than django did. Now, I must caveat this statement with a few facts. My first (and only) experience with django was a couple years ago. I was not nearly as comfortable with python as I am now, and I’m certain the framework has developed more since I last looked. I will definitely be taking another look at it in the not so distant future.

What I don’t really like about what I did with flask

I might like the framework a lot, but there are a few things I did on this project that I’d like to improve on in the future.

  • In order to separate the views from the init module you need to have circular imports, which does NOT feel good to me. The creator explains it in the links above and that it isn’t a big deal, but it really doesn’t feel right and I wish there was a better way to do it.
  • __init__ is pretty freaking large in order to share dependencies (which I believe Blueprints would help with). I’m sure there are more refactorings I can do on that to compartmentalize it better (the links above went a little more in depth with the organization, especially Fbone…something to keep in mind in the future.)
  • I would like to further refactor the views into better organized classes, but for demonstration purposes I think this is ok. I’d also like to implement Blueprints, but I’ll save that for later. These are both just shortcomings of my own time, as they are certainly possible, I just didn’t go the extra step for this demonstration.

We’ve started using flask for a small monitoring web app in production for work and it has worked out quite well. We had a few hiccups with server configuration and had to start using gunicorn to ensure performance, but it was a pretty smooth deployment all around I think.

Also, obviously there are no tests. I’m really not as comfortable with pyunit as I am with NUnit and I didn’t want to delay the work on this side project to learn pyunit. This is on my radar in the near future, and hopefully I can get some TDD practice with pyunit soon.

ZeroMQ

Again, I’m not going to give an introduction or setup tutorial on zeromq. There are plenty of guides online, and they do a wonderful job of that.

I’m very used to working with RabbitMQ in production for work. zeromq is definitely not RabbitMQ, and that’s ok and expected.

What I like about zeromq is that, to me, it seems geared towards simple messaging needs that do not require redundancy and durability and clustering and replication and bla bla bla…it’s simple when you need simple.

And that’s pretty refreshing. RabbitMQ is by no means a piece of bloatware, in my opinion, but it can be a bit daunting to get set up and going. The biggest problem I had with zeromq ended up being fine in the end (I was worried about building the software on Dreamhost, but it worked out just fine). I love that I can send a message like this:

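import zmq

context = zmq.Context()
socket = context.socket(zmq.PUSH)
# zeromq endpoints need a transport prefix such as tcp://
socket.connect('tcp://remoteserver:5000')
# send_json() serializes for us, so there is no need to call json.dumps() first
socket.send_json([{'id': id, 'key': key}])

And process a message like this:

import zmq

context = zmq.Context()
socket = context.socket(zmq.PULL)
# one side of a PUSH/PULL pair has to bind(); if nothing else owns this
# endpoint, the worker would call socket.bind('tcp://*:5000') instead
socket.connect('tcp://127.0.0.1:5000')
# recv_json() already hands back the decoded list
obj = socket.recv_json()[0]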

I really did not have much in the way of a bad time with zeromq. Not much bad I can say about it. I used the PUSH/PULL methodology of distributing jobs kicked off by the web front end to the workers on my backend server. It got the job done pretty easily and quickly. I will definitely be looking to zmq in the future for quick and dirty projects that I need to get up and going quickly.

That being said, it’s kind of difficult for me to trust zeromq in a large scale production app. They provide a lot of good examples and presentations on redundancy and clustering and all the features I would expect from a durable queueing implementation, but it just feels really lightweight to the point that I don’t trust it. And that’s really the only evidence I have against it. I’d love to implement a really high traffic service with it to get some numbers and see some real world usage, but I guess I’m a little afraid to use anything other than RabbitMQ for that sort of thing. One of these days perhaps I will just have to make a leap and give it a shot, because the ease of use and visibility that it allows are really refreshing.

Check this out if you’re interested in a more in depth read about zeromq and messaging ideology in general.

Remix

Remix is an SDK for interacting with echo nest’s hive mind, which allows you to programmatically “chop up” a given song into individual bars, beats, and tatums and “remix” them any way you like.

As a musician and an engineer, this technology has fascinated me for a long time. It blows my mind where we’ve come in terms of computing to allow things like this to be possible.

I messed around with trying to detect segments (i.e. verse, chorus, bridge, etc.) of both songs and using those for the mashes, but it proved to be a bit flaky with my weak first attempt. I think I will work on that in the not-so-distant future because the technology fascinates me, but the main purpose of this project was flask and zeromq and I just wanted to get it out there. Remix hacking will come later :-)

In the meantime, I thought I would share one of the more exceptional remixes I encountered in my testing of MonsterMash: Nine Inch Nails Closer and Johnny Cash covering NIN’s Hurt

Just hearing how these turned out made this whole thing worthwhile. It blows my mind that python code (or any code) can do these types of things now. We’ve come a long way, but we still have a long way to go. I’m excited for what’s coming next!

Please feel free to create your own mash ups on MonsterMash and let me know what you come up with!

Conclusion

I had a lot of fun learning these new technologies. It doesn’t hurt that I am using flask in production, so I benefit from having some experience with something that I will definitely be using more and more in my work.

I’ve still got a lot to learn about all of this stuff.

I really enjoyed a lot of everything I used on this project. In my experience that doesn’t happen very often. I hope the luck keeps going!

A preview of my silly little python mash up generator…

So I’ve been working on a little side project web app to get familiar with both flask (a python web microframework) and zeromq (the socket library that acts as a concurrency framework), along with remix from echonest.

There was this exchange with Brad from devmynd tonight on twitter.

Coupled with the promise of tons of rain from our friendly neighborhood meteorologist, I was inspired to create this unholy amalgamation of mash up nonsense. Behold, Milli Vanilli v. Mastodon in: Blame It On The Megalodon! Warning: your ears cannot unhear this, for better or worse.

Enjoy!

Blame It On The Megalodon

Problem with running an MVC3 project on IIS after opening and building in Visual Studio 2012

I was having trouble running a .NET 4.0, MVC3 web application locally on IIS after building in VS2012. The unit tests would pass, all projects would build successfully, and it would even run from Cassini, but something was just not working properly on IIS. This project SPECIFICALLY references MVC3 assemblies that are included in a packages folder in the project, so I didn’t think it was MVC, but all signs were pointing to that. I uninstalled MVC4, tried changing the references, but everything was still failing.

So today I started going a little slower and found the culprit.

When you open a VS2010 solution in VS2012, you are greeted with a little HTML report of things that it decided to change for you so your project / solution would play nicely in both environments. For the most part, this all seems to work quite nicely.

I noticed that:

  • open MVC3 project in 2012
  • converts solution and projects
  • save all
  • project is now .NET Framework 2.0 for some reason (not explicitly set before?)
  • if I change it in (web project) -> Properties -> Application -> Target Framework dropdown, then VS2012 mucks up my web.config pretty badly, and it’s hard to see what breaks (I was just getting 500s and it wasn’t clear why)

The big thing that jumped out at me today was that web.config was changing along with the .csproj file after I changed the framework.

This set off alarms in my brain, mostly “Why isn’t TargetFrameworkVersion already there?”

So I reverted everything and instead of letting VS2012 change the framework, I just opened the .csproj in Sublime Text 2 (or any text editor) and added this:

<TargetFrameworkVersion>v4.0</TargetFrameworkVersion>

After saving the .csproj file, re-opening the solution and letting VS2012 perform its conversion, and building, the site runs again as expected from IIS. I lost WAY too much time to that nonsense. Hopefully nobody else has to.
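If you have a pile of projects to convert, a quick pre-flight check can tell you which ones are missing the element before VS2012 ever touches them. This is only a sketch: the two trimmed-down project documents and the MyWebApp name are made up for illustration, and the only thing it relies on is the standard MSBuild XML namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by MSBuild project files such as .csproj.
MSBUILD_NS = '{http://schemas.microsoft.com/developer/msbuild/2003}'


def target_framework(csproj_xml):
    """Return the TargetFrameworkVersion text from a .csproj document, or None if absent."""
    root = ET.fromstring(csproj_xml)
    node = root.find('.//' + MSBUILD_NS + 'TargetFrameworkVersion')
    return node.text if node is not None else None


# Hypothetical project file that never declared a target framework.
missing = '''<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <AssemblyName>MyWebApp</AssemblyName>
  </PropertyGroup>
</Project>'''

# The same file after adding the element by hand.
fixed = missing.replace(
    '</AssemblyName>',
    '</AssemblyName>\n    <TargetFrameworkVersion>v4.0</TargetFrameworkVersion>')

print(target_framework(missing))
print(target_framework(fixed))
```

For a real file on disk, ET.parse(path).getroot() would take the place of ET.fromstring(), since real project files usually start with an XML declaration.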

Top 10 Albums of 2012

These, in my opinion, are the best 10 records I heard from 2012.

It was pretty easy this year. Here we go.

10.) Torche – Harmonicraft

I didn’t listen to this record (and, to be quite honest, never listened to anything of theirs) until a couple weeks ago. I stumbled upon this one after looking through all the releases of 2012. I’m really glad I did.

These guys fit in well in my library, and Harmonicraft was an awesome record. Really good, classic rock inspired metal. I need to check out their other stuff after hearing this album.

9.) Meshuggah – Koloss

Koloss. This album broke my head the first time I listened to it. Rattle the mountains with rhythm jackhammers.

I’ve been awaiting this release since obZen, which to this day I think is still one of the heaviest records EVER. Koloss is a lot more groovy than obZen was heavy, which is fine with me.

8.) Lamb of God – Resolution

Lamb of God is the workhorse of metal. You usually know what you’re getting into with them. And Resolution is no exception. That being said, they’re so polished and heavy it’s kind of hard to NOT get pulled in by their tunes.

Excellent record from an excellent band.

7.) Exotic Animal Petting Zoo – Tree of Tongues

Never heard of these guys before 2012. Then I saw the video for Thorough.Modern online and it really caught my attention. They’ve got this weird later era Poison the Well meets Glassjaw meets Dillinger Escape Plan kind of mathy vibe to them. I love it.

These guys fucking rock. Big surprise from this record. I listened to it a lot over the year. It has a meaty set of songs that are really bizarre in that good way.

There’s lots of mathy prog metal bands, and a lot of them are really good. The tinge of hardcore these guys bring to the table really sticks out to me. Really memorable songs that were stuck in my brain long after the first listen.

6.) Gojira – L’Enfant Sauvage

I never really got into these guys heavily. I think I’ve heard all their previous records, I just never dove in. I think I listened to this record 15 or 20 times right after it came out.

I was really surprised by this one. Explosia came on my speakers and that riff just blasted me in the face. Then those machine gun double bass kicks coming in. Raw.

5.) Every Time I Die – Ex-Lives

To paint you the picture of how much I was anticipating this record, I pre-ordered it, and when it came to my mailbox the day before the release date I blasted this on my singer’s PA system in my basement 3 times while drinking whiskey. Exactly as it should be.

I’ve been an ETIDiot since just after New Junk Aesthetic came out. My friend Gordon turned me on to them, and I’ve eaten up all their songs since.

This was well worth the wait. The Buckley brothers keep it killer on this batch of tunes, and it feels just like every party I’ve ever wanted to go to.

I saw them live for the first time recently, and every video I’ve seen was an understatement. They’re even better live than on record, and that seems impossible to do at their breakneck pace.

Underwater Bimbos from Outer Space, I Suck [Blood], Partying is Such Sweet Sorrow, and Business Casualty are club bangers in their own right. Great record, I can’t wait for the next one.

4.) The Mars Volta – Noctourniquet

I like The Mars Volta. I like their experimental, noisy, abstract, jammy stuff from previous recordings.

That being said, I think this is their best, most cohesive record to date.

I listened to a lot of interviews with them leading up to the release, and they talked a lot about the song writing process. They also discussed a lot of the At the Drive-In performances, and it seems like the presence of that other group really brought out the best of them on Noctourniquet.

It’s a joy to listen to. It’s easily the most accessible and song-like album The Mars Volta have released, but I think that’s ok. These songs just dig in and don’t let go. It’s a wonderful listen from start to finish, and reads just like the story they intended.

3.) Between the Buried and Me – The Parallax II: Future Sequence

I was looking forward to The Parallax II after I enjoyed the hell out of The Parallax: Hypersleep Dialogues EP that came out last year. I heard it would continue on with the same themes, and that got me excited.

I got onto their band wagon pretty late in their career, but have become a big fan. They’ve been described to me as “The Pink Floyd of Death Metal” before, and while that might not be exactly accurate, it does paint the right picture.

This record is really well produced, thought out, and performed. They nailed it. Silent Flight Parliament is perhaps one of my new favorite songs of all time. This album has it all.

2.) Deftones – Koi No Yokan

What can you say about the Deftones that hasn’t already been said? They’re one of my favorite bands and have kept up the momentum of Diamond Eyes with this record.

As always from these guys, you get a great blend of aggressive riffs, ethereal atmosphere, and lush soundscapes. Coupled with Chino’s always encompassing vocals, you get the formula for modern rock majesty.

Leathers has crept its way into my subconscious, as have Goon Squad, Rosemary, and Poltergeist.

Another classic from the Deftones. Get well, Chi.

1.) Baroness – Yellow & Green

I appreciate the way that Baroness names their records after colors. I usually associate music with colors pretty aggressively myself. It’s always been that way for me.

I also tightly couple music to seasons of the year, as well. When I first heard this record, it hit me with a blast of the feeling I get in the fall listening to new music, despite being released the day after my birthday in July. I was floored.

When Yellow & Green came out, I was just starting to really appreciate The Blue Record and its majesty, and had only listened to The Red Album a few times. I liked Baroness, but wasn’t a huge fan or anything.

After giving Yellow & Green a few spins, I was a convert.

The way these guys write songs hits very close to home for me. From the lyrics and themes to the guitar melodies and riffs, every song is unique in its own right.

To me, both albums feel completely independent and on their own, yet both perfectly complement the other. The first disc (Yellow) feels a lot more like a classic rock era record, while the second (Green) sounds a bit more produced and experimental.

Every single song on these records speaks to me. John Baizley writes some damn good stories, and I for one can’t wait for the next batch, whenever they recover from the bus crash. Every single song on Yellow should get single/video treatment.

March to the Sea, Take My Bones Away, Eula, Cocainium, Little Things, Board Up the House, I mean, I should just put the whole record here. Every one is a standout in its own right. If you haven’t checked this out yet you owe it to yourself to do so as soon as possible.

Hands down my favorite album of 2012. I wore out the bits on this one.

Ubuntu Desktop 12.10 Guest Additions on Virtualbox

I had the unfortunate annoyance of trying to get Ubuntu Desktop 12.10 Guest Additions working on VirtualBox 4.2.4, which have a strong desire to NOT install linux kernel headers. Here are the steps. Don’t be annoyed. I hope this finds you quickly.

I’m like you. I like my VM to scale when I change dimensions of the window, I like to copy/paste between the host/guest, and I expect that to work. Out of the box, it doesn’t. I assume this will be fixed shortly, but in the meantime, do this and you’ll be good to go.

For what it’s worth, I’m on a Windows 7 64-bit Ultimate host, with the newest Ubuntu Desktop 12.10 .iso from ubuntu.com installed onto VirtualBox 4.2.4 r81684.

- DO NOT INSTALL GUEST ADDITIONS UNTIL THE END. You can do it now, if you want, but it’s just going to be pointless and cause you pain. If you have already installed them, uninstall them either by running VBoxLinuxAdditions.run uninstall from the mounted disc or, if you installed from apt-get, by running

sudo apt-get remove virtualbox-guest-additions 

from terminal and be done with it.

This command makes your Guest Additions install so happy it actually succeeds:

sudo apt-get install linux-headers-generic

Run that in a terminal. That’s it.

Now, shut down the VM and restart it. When it boots back up, in the VirtualBox guest window click on Devices -> Install Guest Additions and follow the prompts. This should succeed.

Shut down the VM and restart, and your VirtualBox VM should resize dynamically like you’re used to, and be able to copy/paste between host and guest and vice versa.

Something funky is going on with xserver and the way the dependencies are resolving during Ubuntu 12.10 installation. I’m sure this will get smoothed out (it is a pretty recent release), but it is still annoying.

Enjoy.

RabbitMQ Highly Available Queues and Clustering using Amazon EC2

Using RabbitMQ on Amazon EC2 is an easy, performant way to operate a service oriented application. It’s pretty trivial to set up and once you do, you can usually forget about it and go about your day.

Until Amazon has an EC2 outage. And your bus goes down. And you don’t have a plan for getting back up quickly. Fail. Fail. Fail.

Fortunately, since version 2.6.0 (I believe…I could be wrong…) RabbitMQ has supported Highly Available queues (basically replicating queues across nodes in a cluster) to ensure that you don’t need to be choked by a single point of failure in your messaging infrastructure and can still be performant and scalable.

What I want to discuss today is setting up a RabbitMQ cluster with Highly Available queues using Amazon EC2. I’m sure you can use these techniques in a different environment, but I am tailoring all of this to a specific situation since I’m familiar with it and there doesn’t seem to be a whole lot of information pertaining to it out there.

Before we begin, I must caveat this post with a few important notes that I think are easy to overlook.

Hostnames = nodenames

It is very important that you understand the role of the hostname for each of your instances when dealing with RabbitMQ clusters. The way that the nodes in a cluster identify and communicate with each other, on Amazon’s infrastructure and in general, depends on it. RabbitMQ, by default, will use rabbit@hostname for the name of the node. It really doesn’t matter what you use for the hostname, as long as you can identify it later. For this post, let’s assume they will be ubuntu- followed by the availability zone they are in. For example, ubuntu-us-east-1a or ubuntu-us-east-1b.

Firewall rules

This might be obvious to some, but it is very important that each of your RabbitMQ nodes can communicate with one another. I think that if you are using RabbitMQ in the cloud you are aware of this, but just in case please keep it in mind. RabbitMQ communicates, by default, over port 5672. Therefore, it would be wise to assign a security group to each of these instances that allows port 5672, at least to instances within the same security group or another one you have set up. Otherwise debugging an issue will be unnecessarily difficult, and nobody wants to deal with that, right?

Updated 2012-10-25 04:20:55 UTC: Per Brett’s suggestion in the comments, which I was ignorant of, it is a great idea to open the port that epmd (Erlang Port Mapper Daemon) uses, since epmd is the tool that RabbitMQ relies on to identify nodes in its cluster. That port is 4369 by default. Once the nodes are identified, by default they communicate through pretty much any available random port. You can add the following to your rabbitmq.config to override this behavior, so you only need to open a specific port. Using Brett’s example of port 65535, the following would be added to rabbitmq.config:

 
[
  {kernel,
    [{inet_dist_listen_min, 65535},
     {inet_dist_listen_max, 65535}
    ]
  }
].
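Before blaming RabbitMQ for a cluster that won’t form, it can save time to confirm the ports are actually reachable from the other instance. The following is a rough sketch using only the python standard library; the port_open and check_node helpers are names I made up, and the default ports are simply the 5672 and 4369 defaults mentioned above.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and hostnames that do not resolve.
        return False


def check_node(host, ports=(5672, 4369)):
    """Map each port to whether it is reachable on host."""
    return {port: port_open(host, port) for port in ports}
```

From the first instance you would call something like check_node('ubuntu-us-east-1b') and expect True for every port. If you pinned the distribution port in rabbitmq.config, add it to the ports tuple as well.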

Booting instance and installing RabbitMQ

To start, I booted up an Ubuntu Server 12.04 instance in us-east-1a availability zone. Since we are keeping in mind redundancy and geographical outages, we’re going to boot each instance in a different zone to better insulate from failure scenarios.

Please keep in mind I am using ubuntu 12.04, so your results may require a bit of deviating from what I’m doing to work in your particular environment.


cd /etc/apt/sources.list.d
sudo vim apt-rabbitmq.list
# add this line to apt-rabbitmq.list, then save and quit:
#   deb http://www.rabbitmq.com/debian testing main
sudo apt-get update
sudo apt-get install rabbitmq-server

This should install rabbitmq-server v. 2.8.7-1 as of the publishing of this blog. As long as you are using version 2.8.6 or greater you should be ok (they fixed some bugs introduced in v. 2.8.5 having to do with the shutting down of a mirrored queue, which is exactly what we will be focusing on).

Starting up a cluster

Next, we need to begin creating our cluster of nodes.


# the service gets started on install, so stop it first
sudo /etc/init.d/rabbitmq-server stop
sudo rabbitmq-server -detached
sudo rabbitmqctl stop_app
sudo rabbitmqctl reset
sudo rabbitmqctl start_app
# should show one node running and one node in the cluster
sudo rabbitmqctl cluster_status

cluster_status Output:


Cluster status of node 'rabbit@ubuntu-us-east-1a' ...
[{nodes,[{disc,['rabbit@ubuntu-us-east-1a']}]},
{running_nodes,['rabbit@ubuntu-us-east-1a']}]
...done.

Now, we have one node running in a cluster, which right now only has itself in it. Let’s add another node to our cluster.

Spinning up another node

Spin up another instance (PREFERABLY in a completely separate availability zone; I’m using us-east-1b, so this instance’s hostname is ubuntu-us-east-1b) and run the previous steps up until you start running rabbitmqctl commands. Instead of joining its own cluster, we want this new instance to join the cluster defined by the previous ubuntu-us-east-1a node.


sudo rabbitmqctl stop_app
sudo rabbitmqctl reset
# pick ONE of the two cluster commands below
# listing this node itself makes it a disk node:
sudo rabbitmqctl cluster rabbit@ubuntu-us-east-1a rabbit@ubuntu-us-east-1b
# leaving this node out makes it a memory (RAM) node:
sudo rabbitmqctl cluster rabbit@ubuntu-us-east-1a
sudo rabbitmqctl start_app
sudo rabbitmqctl cluster_status

Running sudo rabbitmqctl cluster_status on either instance should now show them both in the cluster and running, similar to this:


Cluster status of node 'rabbit@ubuntu-us-east-1b' ...
[{nodes,[{disc,['rabbit@ubuntu-us-east-1b','rabbit@ubuntu-us-east-1a']}]},
{running_nodes,['rabbit@ubuntu-us-east-1b','rabbit@ubuntu-us-east-1a']}]
...done.
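If you want to check this from a script rather than by eye, the running nodes can be pulled out of that output with a regular expression. This is a sketch that assumes the 2.8-era output format shown above; the running_nodes helper is a name I made up, and other RabbitMQ versions print the status differently.

```python
import re


def running_nodes(status_output):
    """Extract the node names listed under running_nodes in cluster_status output."""
    match = re.search(r"\{running_nodes,\[(.*?)\]\}", status_output, re.S)
    if match is None:
        return []
    # Node names are single-quoted Erlang atoms such as 'rabbit@host'.
    return re.findall(r"'([^']+)'", match.group(1))


sample = """Cluster status of node 'rabbit@ubuntu-us-east-1b' ...
[{nodes,[{disc,['rabbit@ubuntu-us-east-1b','rabbit@ubuntu-us-east-1a']}]},
{running_nodes,['rabbit@ubuntu-us-east-1b','rabbit@ubuntu-us-east-1a']}]
...done.
"""

expected = {'rabbit@ubuntu-us-east-1a', 'rabbit@ubuntu-us-east-1b'}
nodes = running_nodes(sample)
print(nodes, set(nodes) == expected)
```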

Setting up Highly Available queues

Now let’s set up an exchange and a highly available queue so we can send messages and ensure they are replicated across all our nodes.

I used python and pika, but there are NUMEROUS other clients in most languages out there. The actual nitty gritty here is outside the scope of this post, but I’m sure it shouldn’t be terribly hard to take these ideas and apply them to the language of your choosing. Run this code on the ubuntu-us-east-1a instance.


#!/usr/bin/env python

from pika.adapters import BlockingConnection
from pika import BasicProperties

connection = BlockingConnection()

channel = connection.channel()

client_params = {"x-ha-policy": "all"}

exchange_name = 'public'
queue_name = 'test_queue'
routing_key = 'test_routing_key'

channel.exchange_declare(exchange=exchange_name, type='topic')

channel.queue_declare(queue=queue_name, durable=True, arguments=client_params )

channel.queue_bind(exchange=exchange_name, queue=queue_name, routing_key=routing_key)

connection.close()

Let’s break down what we’re doing here:

We’re declaring our exchange like normal.

You see our queue_declare method has arguments=client_params. “x-ha-policy” : “all” informs RabbitMQ that we want this queue to be highly available and replicated amongst our clustered nodes. This gives us the redundancy we are looking for. (source: http://www.rabbitmq.com/ha.html)

We create a binding like normal, and then we can just publish messages like normal, and RabbitMQ will handle all the replication across the cluster nodes for us.

Here’s where things get fun, and a little tricky.

When catastrophe strikes…

The whole idea here is that when one node goes down the entire bus doesn’t get taken out with it. You still want your system to function.

So, let’s run a test.

With our 2 node cluster, let’s send a message to our bus cluster.


#!/usr/bin/env python

from pika.adapters import BlockingConnection
from pika import BasicProperties

connection = BlockingConnection()

channel = connection.channel()

exchange_name = 'public'
routing_key = 'test_routing_key'

# delivery_mode=1 marks the message as non-persistent (2 would be persistent)
channel.basic_publish(exchange=exchange_name, routing_key=routing_key, body='testing mirroring!', properties=BasicProperties(content_type="text/plain", delivery_mode=1))

print("publish complete")

connection.close()

The output from sudo rabbitmqctl list_queues on either node should look like this:


Listing queues ...
test_queue 1
...done.

This shows that exactly one message is in the ‘test_queue’ queue on both nodes, but we only published it to one node. Our replication works!
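
If you’d rather script this check than eyeball it, the output is easy enough to parse. Here’s a minimal sketch in Python; parse_list_queues is a name I made up for illustration (it is not part of rabbitmqctl), and it assumes the plain two-column output shown above.

```python
def parse_list_queues(output):
    """Turn `rabbitmqctl list_queues` text into {queue_name: message_count}.

    Assumes the plain two-column output shown above; the banner
    ("Listing queues ...") and footer ("...done.") lines are skipped.
    """
    counts = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows are exactly "<name> <integer>"; anything else is ignored.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


sample = """Listing queues ...
test_queue 1
...done.
"""
print(parse_list_queues(sample))
```

Run it against the output captured from each node and compare the results; if mirroring is working, they should match.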

Now, kill one of the instances. That’s right. Nuke it. It’s ok. You can even go into the instance, get the PID for the rabbitmq process, and sudo kill -9 it if you like, in order to test a more disastrous situation. In fact, let’s do that. We’re going to ps aux | grep rabbitmq to get the PID for our rabbitmq process and then sudo kill -9 that PID.

DISCLAIMER: Please be sure you know what you’re doing here. Don’t go and sudo kill -9 all willy-nilly and then come back complaining about your machine being in a funky state. You’ve been warned, but if you have read this far, I’m not too worried.

Running sudo rabbitmqctl cluster_status from the ubuntu-us-east-1b instance should now fail, since rabbitmq-server is no longer running. This is ok, and a part of our disaster experiment. We’ll make it better later, I promise!

But if you go to the ubuntu-us-east-1a node and run sudo rabbitmqctl cluster_status, it is alive and well, and shows that the other node is just not running. Sending a message to this (ubuntu-us-east-1a) node that is still running will properly queue the message.


Cluster status of node 'rabbit@ubuntu-us-east-1a' ...
[{nodes,[{disc,['rabbit@ubuntu-us-east-1b','rabbit@ubuntu-us-east-1a']}]},
{running_nodes,['rabbit@ubuntu-us-east-1a']}]
...done.

Disaster recovery

Now, if we were to bring that bad node back into the cluster, like so:


sudo rabbitmq-server -detached

And then run sudo rabbitmqctl list_queues, you will see the message has been properly replicated! No data lost!

The takeaway here is that even if there is a disastrous network interruption, you can configure your client applications to use the clustered endpoints to ensure that there is a MUCH better chance of them communicating their messages to the broker.
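
Here’s one way a client could make use of more than one endpoint. This is only a sketch under my own assumptions: connect_to_first_available and fake_connect are hypothetical names, and the actual connection call is passed in as a function so the example runs without a broker.

```python
def connect_to_first_available(hosts, connect):
    """Try each broker host in order and return (host, connection).

    `connect` is any callable that takes a hostname and either returns
    a connection or raises an exception (hypothetical; supply your own).
    """
    errors = {}
    for host in hosts:
        try:
            return host, connect(host)
        except Exception as exc:  # a dead node just means "try the next one"
            errors[host] = exc
    raise RuntimeError("no broker reachable: %r" % (errors,))


# Fake connect function for the demo: pretend ubuntu-us-east-1b is the
# node we killed and ubuntu-us-east-1a is still up.
def fake_connect(host):
    if host == "ubuntu-us-east-1b":
        raise IOError("connection refused")
    return "connection to " + host


host, conn = connect_to_first_available(
    ["ubuntu-us-east-1b", "ubuntu-us-east-1a"], fake_connect)
print(host)
```

With pika, the connect callable could be something along the lines of lambda host: BlockingConnection(ConnectionParameters(host=host)), but check that against the pika version you have installed.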

What happens when the instance completely dies and we need to replace it?

Replacing a degraded instance is a normal operation in the cloud, but when using EC2 there are a couple of things to keep in mind. You need to be able to get the hostname for the killed instance. This is pretty simple, even if the host is long gone and you cannot access the instance metadata anymore. Just go to a healthy node and run sudo rabbitmqctl cluster_status. You should be able to spot the node that shows in the cluster but is not running; the hostname is the part after rabbit@ in the nodename. If you don’t have ANY healthy nodes left, well…in that extreme case, I think you have more problems than I can help with!
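
If you want to automate that deduction, something like the following could work. It’s a rough sketch that assumes the cluster_status output format shown earlier in this post; down_nodes and hostname_of are names I invented for the example.

```python
import re

QUOTED_ATOM = re.compile(r"'([^']+)'")


def down_nodes(cluster_status_output):
    """Return cluster members that are missing from running_nodes.

    Assumes the output format shown in this post, with the {nodes,...}
    term appearing before the {running_nodes,...} term.
    """
    text = cluster_status_output
    nodes_start = text.find("{nodes")
    running_start = text.find("{running_nodes")
    if nodes_start == -1 or running_start == -1:
        raise ValueError("unrecognised cluster_status output")
    members = set(QUOTED_ATOM.findall(text[nodes_start:running_start]))
    running = set(QUOTED_ATOM.findall(text[running_start:]))
    return sorted(members - running)


def hostname_of(nodename):
    """'rabbit@ubuntu-us-east-1b' -> 'ubuntu-us-east-1b'."""
    return nodename.split("@", 1)[1]


status = """Cluster status of node 'rabbit@ubuntu-us-east-1a' ...
[{nodes,[{disc,['rabbit@ubuntu-us-east-1b','rabbit@ubuntu-us-east-1a']}]},
{running_nodes,['rabbit@ubuntu-us-east-1a']}]
...done.
"""
for node in down_nodes(status):
    print(hostname_of(node))
```

Feed it the text captured from sudo rabbitmqctl cluster_status on a healthy node, and it prints the hostname of every node that is in the cluster but not running.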

Spin up new instance (remember, different availability zone!)

Let’s use ubuntu-us-east-1c this time. Remember, since we want to replace the ubuntu-us-east-1b node in the cluster, we need to make the new ubuntu-us-east-1c node look like the failed instance to RabbitMQ. This is how we do that:


echo ubuntu-us-east-1b | sudo tee /etc/hostname
sudo vim /etc/hosts
- 127.0.0.1 ubuntu-us-east-1b and remove any specific hostname redirects for old host
sudo vim /etc/rc.local
- hostname ubuntu-us-east-1b added before exit 0
# reboot instance
sudo rabbitmqctl cluster rabbit@ubuntu-us-east-1a rabbit@ubuntu-us-east-1b

The confusing part for me was associating this old hostname with the new instance. Since the cluster was created with the old name, and the running nodes still hold a reference to that nodename, you can’t just add a new node with any nodename. The other nodes will not see the old node in the cluster list and will not work correctly. This may have been fixed in a more recent build, but from what I understand this procedure is important. The hostname must match EXACTLY, because of the way RabbitMQ manages the cluster nodes.
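
Since the hostname has to match exactly, a quick sanity check before rejoining the cluster can save some head-scratching. This is just a sketch; hostname_matches is a hypothetical helper, and it assumes nodenames of the form rabbit@hostname as used throughout this post.

```python
import socket


def hostname_matches(nodename, hostname=None):
    """True only if hostname is exactly the host part of nodename.

    The comparison is deliberately strict (case-sensitive, no domain
    stripping). hostname defaults to this machine's hostname.
    """
    if hostname is None:
        hostname = socket.gethostname()
    _, _, expected = nodename.partition("@")
    return bool(expected) and hostname == expected


print(hostname_matches("rabbit@ubuntu-us-east-1b", "ubuntu-us-east-1b"))
print(hostname_matches("rabbit@ubuntu-us-east-1b", "ubuntu-us-east-1c"))
```

Called with just the nodename on the replacement instance, it compares against whatever the machine currently reports as its hostname.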

As you see from running sudo rabbitmqctl list_queues from the new node, the queue data has been properly replicated to the new node!

Now this node will operate just like the old instance. It’s a little tricky and awkward, but not terribly bad.

This, of course, can all be scripted up with Puppet, Chef, or other admin scripts already in your environment.

Update: 2012-10-25 04:36:30 UTC: Carl pointed out that RabbitMQ inherently does not tolerate partitioning across availability zones due to potential cluster corruption from data loss (third paragraph). This is a valid point. However, the tradeoff between getting something operational as simply as possible now and adding complexity later led me to use naive Highly Available queues and clustering only. The documentation mentions some plug-ins to enable better replication over WAN, such as federation. I believe this looks to be a great addition to what I have written about here, and I will definitely be looking into it in the very near future.

Give yourself more than one point of failure

Coming off of the recent spate of EC2 outages, single points of failure are hot on the minds of admins everywhere. If uptime is an important feature for your app (and isn’t it one for EVERY app?) this is another tool for the kit that can help prevent down time in case of emergency.

Exploring Jasmine BDD Framework for Javascript

It has been AGES since I’ve put anything down on paper. I remember when I was at least writing something here (…anything) that my communication skills only improved. I can tell in the past year my skills have deteriorated a bit as I have not been required to write as much. This must be remedied, and I’ve found a great candidate that I think is both personally applicable and hopefully will benefit someone out there as well. There will hopefully be a separate string of posts outlining my new experiences and journeys in the start-up world, but that can wait for the time being.

And let’s get at it!

I recently have been doing a lot of Javascript development at work. After speaking with some coworkers and developer friends of mine, there seems to be an impetus to better organize Javascript source files. I know from my experience that I’ve been in situations, both professionally and in my own side-projects, where my JS files just grow and grow like a wildebeest. Other times, I have a difficult time validating that my logic has been correctly implemented. On the server side it’s usually a lot easier to validate behavior, either through proper testing or looking at values in a database or many other means. With Javascript, usually you have to fire up the browser and physically navigate through your page in order to visually validate the behavior you are expecting.

This technique stinks. I know I always forget to check something or leave something off or, worse, introduce regression bugs into my scripts. Bad, Karl.

Jasmine (http://pivotal.github.com/jasmine) appears to at least help facilitate BDD when writing Javascript. I’ve played around with it a little bit and have found it to be quite nice for designing Javascript, and hopefully it will give me newfound confidence in the scripts I write.

Judge for yourself, though. I am but one developer who has had a decent introductory experience. Try the tool out for yourself and see if it helps your workflow.

For discussion’s sake I created a very simple spec. Let’s say we want to have an image editor, with very simple requirements.

ImageEditor
———–
- Should be able to open an image from URL
- When an image is opened, Should show controls for editing the image
- When an image is opened, Should be able to close the image
- When no image is opened, throw an exception if try to close

So, super simple spec, but a spec nonetheless. We’re given four requirements to implement in our Javascript code.

To start, I’m going to download the latest standalone release of jasmine (as of May 20th, 2011, that was 1.0.2) here (http://pivotal.github.com/jasmine/download.html). If you have a Ruby project you will be working with there is a Jasmine Gem, but to keep things language agnostic I’m not going to make any assumptions.

The zip comes with some example specs and .js files, which I basically followed along with in this post. Nothing fancy, just a “Hello, world”, which is basically what I’m showing you here.

First thing to take a look at is spec/SpecRunner.html.

When you first load the page, it runs your tests and shows you the results, either green if they all pass or red for any failures. Standard test fare. You can click the “Show passed” checkbox to drill down into each individual test.

Nice and clean.

Ok, so let’s start coding some Javascript!

Our first spec is “Should be able to open an image from URL”. So let’s write an ImageSpec.js spec first:

ImageSpec.js

describe("Image", function() {
	var image;
	
	beforeEach(function() {
		image = new Image();
	});
	
	it("should be able to set an image url", function() {
		var url = "http://www.google.com/images/logo_sm.gif";
		image.load(url);
		expect(image.url).toEqual(url);
	});
});

Notice the language that Jasmine uses. The describe function is used to encapsulate a suite of tests. The it function specifies your test. This makes your test code read VERY similarly to your spec.

We’re going to add a reference to SpecRunner.html for our new ImageSpec.js file. It looks like this:

SpecRunner.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>Jasmine Test Runner</title>
  <link rel="stylesheet" type="text/css" href="lib/jasmine-1.0.2/jasmine.css">
  <script type="text/javascript" src="lib/jasmine-1.0.2/jasmine.js"></script>
  <script type="text/javascript" src="lib/jasmine-1.0.2/jasmine-html.js"></script>

  <!-- include source files here... -->

  <!-- include spec files here... -->
  <script type="text/javascript" src="spec/SpecHelper.js"></script>
  <script type="text/javascript" src="spec/ImageSpec.js"></script>

</head>
<body>

<script type="text/javascript">
  jasmine.getEnv().addReporter(new jasmine.TrivialReporter());
  jasmine.getEnv().execute();
</script>

</body>
</html>

Running this spec we should see 1 failure:

Which we do. Ok, now let’s write the code to make this pass.

Image.js

function Image() {
}

Image.prototype.load = function(url) {
	this.url = url;
}

Ok, now we have to add the following line to our SpecRunner.html in order to load our script:

SpecRunner.html

<script type="text/javascript" src="src/Image.js"></script>

And this is where my first gripe with Jasmine comes in: in a large project, I can have MANY Javascript files. This means I’m going to be editing this SpecRunner.html file A LOT while I’m developing. Weak. Apparently, the Ruby Gem eliminates this problem, and I’m willing to bet there are plugins out there that will alleviate this pain on agnostic projects. For the time being though it’s just something to keep in mind that I thought was annoying.

After we run our test now we should see green:

And it passes. Alright, let’s move onto the ImageEditor.

Our next spec states: When an image is opened, should show controls for editing the image.

Ok, let’s write a test.

ImageEditorSpec.js

describe("ImageEditor", function() {
	var imageEditor;
	var image;
	
	beforeEach(function() {
		imageEditor = new ImageEditor();
		image = new Image();
	});
	
	it("should be able to open an image from URL", function() {
		imageEditor.open(image);
		expect(imageEditor.currentImage).toEqual(image);
	});
});

Now we need to add the following line to our SpecRunner.html (somewhat annoyed with that yet? You might be…):

SpecRunner.html

<script type="text/javascript" src="spec/ImageEditorSpec.js"></script>

Run the tests, and we should have a failure:

Ok, let’s pass this test.

ImageEditor.js

function ImageEditor() {
}

ImageEditor.prototype.open = function(image) {
	this.currentImage = image;
}

Now we run the test:

Still fails. D’oh! We forgot to add the ImageEditor.js file to our SpecRunner.html file! Are you really annoyed with that yet? (I know I am…)

SpecRunner.html

<script type="text/javascript" src="src/ImageEditor.js"></script>

Ok, now let’s run the test:

Green. Passed.

Ok, so we’ve opened the image. But there’s still part of the spec that isn’t implemented: should show controls for editing the image.

Ok, let’s use Jasmine’s spyOn function to monitor when functions are called.

Here’s our test:

ImageEditorSpec.js

it("should show controls when an image is opened", function() {
	spyOn(imageEditor, 'showControls');
	
	imageEditor.open(image);
	
	expect(imageEditor.showControls).toHaveBeenCalledWith(true);
});	

I think of this as something similar to a Stub in Rhino.Mocks. I just want to make sure my function gets called when the image is opened. This will of course fail until we implement it.

So let’s implement this behavior:

ImageEditor.js

ImageEditor.prototype.open = function(image) {
	this.currentImage = image;
	this.showControls(true);
}

ImageEditor.prototype.showControls = function(showControls) {
	this.isShowControls = showControls;
}

As you can see, the open function now calls showControls, which our test will check for. In true BDD I should write a test for showControls as well, but for the sake of this blog post I will omit that. Now we run our tests and:

Tada, green.

Ok, for our last spec, let’s use Jasmine’s expect function to test if an exception is thrown:

- When no image is opened, throw an exception if try to close

Here’s our test, where we are explicitly checking for an exception:

ImageEditorSpec.js

describe("#close", function() {
    it("should throw an exception if image editor already closed", function() {
        expect(function() {
            imageEditor.close();
        }).toThrow("image editor already closed");
    });
});

I love the way this reads. It’s pure Javascript, but to me it’s extremely straightforward and doesn’t require much in the way of picking up and going.

Running the tests should show a failure:

Which it does. Let’s write the code to make it pass:

ImageEditor.js

ImageEditor.prototype.close = function() {
    if(!this.currentImage) {
        throw new Error('image editor already closed');
    }
	
    this.currentImage = null;
    this.showControls(false);
}

Pass. As you can see in my code, there are a couple more functions that should be tested. Here is the spec I used for testing those:

ImageEditorSpec.js

describe("when an image is open", function() {
    beforeEach(function() {
        imageEditor.open(image);
    });
	
    it("should show the editing controls", function() {
        expect(imageEditor.isShowControls).toBeTruthy();
    });
	
    it("should be able to close the image", function() {
        imageEditor.close();
        expect(imageEditor.currentImage).toEqual(null);
    });
			
    it("should hide controls when an image is closed", function() {
        spyOn(imageEditor, 'showControls');
		
        imageEditor.close();
		
        expect(imageEditor.showControls).toHaveBeenCalledWith(false);
    });
});

And these all pass…

If this framework improves my code even a little I will be thankful. I don’t consider myself a Javascript expert by any means, but I think this tool will at least give me the confidence I need to break new ground and learn even more going forward, as well as being a lot more confident with refactorings.

In the future I hope to expand on this post with more advanced testing and bringing jQuery into my tests, as we use that library in production.

Programmatically uploading videos to YouTube using C#

Recently I spent some time investigating the Google Data .NET Client library. Specifically, I was interested in the YouTube Data API. What I wanted to do was programmatically upload a video file to my YouTube account. I ran into a couple of (minor) speed bumps along the way, and noticed there were a few things that weren’t as clear as they should have been. Hopefully I can clarify the problems I encountered, in case future developers run into the same trip ups. Let’s get started.

For the context of this post, I should explain my development environment. I am using the following:

  • Windows 7 Professional (32-bit)
  • Visual Studio 2008 Professional SP1
  • .NET 3.5 SP1
  • ASP.NET MVC 1.0

OK, so we’re going to set up a very basic ASP.NET MVC web site that will basically do two things:

  • Provide a link to use for authenticating a Google Account
  • Provide a form to direct upload a video (including metadata)

First things first: download the most up to date version of the Google Data .NET Client library and follow the instructions for installing and setting it up.

Next, ensure you have a YouTube developer API key attached to your YouTube account. If you have not done this yet, go here and associate a Developer ID with your account. Take note of this ID (it’s pretty long).

Now, let’s set up a new ASP.NET MVC project. We’re going to use the Visual Studio defaults here, and just name our project “YouTubeUploader”.

Next, we need to add some references to the Google APIs. When you install the Google Data API, there should be a solution at All Programs -> Google Data API SDK -> Google Data API SDK.sln that the setup guide tells you to open and build. Once you have done this, you can select these binaries as a reference in your current project, which is what we do here.

Next, we’re going to create a ViewModel to encapsulate all the inputs required to pass to our video uploader. This is going to be a very basic ViewModel, with nothing more than properties for retrieving inputs for our video. Here’s what the code looks like:


namespace YouTubeUploader.Models
{
    public class UploadViewModel
    {
        public string Title { get; set; }
        public string Keywords { get; set; }
        public string Description { get; set; }
        public bool Private { get; set; }
        public string VideoTags { get; set; }
        public double Latitude { get; set; }
        public double Longitude { get; set; }
        public string Path { get; set; }
        public string Type { get; set; }
    }
}

Like I said, very basic ViewModel here.

Next, we need to add a controller method to handle our Login logic. For simplicity’s sake, we’re going to use the HomeController for all of our methods here. In your production situation, however, this logic might be split apart into different modules. We’re just going for the basic “Hello, World” functionality here.

In order to successfully make YouTube API calls (or any Google Data API, for that matter) you must retrieve an authenticated session token from the Google servers. This can be accomplished a number of different ways. Since we’re trying to make a web site here, we’re going to go with the AuthSub method of Google authentication. Here, we provide a link where our user can go and authenticate with the Google servers; Google then sends back a token and redirects the user to a page of our choosing. This token is returned as part of the request query string, which we can handle in a number of different ways. For our purposes, we are going to use a string parameter on one of our controller methods to take the token and use it to create an authenticated session token in memory. The method will look like this:
(NOTE: throughout the post, I reference “http://localhost:50555/” as my development server. I am just running my site through Visual Studio 2008 and am taking the default server address provided. This may vary in your environment, so please replace this address for what your environment requires.)


public ActionResult Login()
{
    Session["authSubUrl"] = AuthSubUtil.getRequestUrl("http://localhost:50555/Home/Upload", "http://gdata.youtube.com", false, true);

    return View();
}

What we’re doing here is using a Google utility (AuthSubUtil.getRequestUrl) to generate the text for our link to provide to our users. getRequestUrl takes the following parameters:

  • continueUrl: Where the user will be redirected after authenticating. For our example, I used my local development server (http://localhost:50555/Home/Upload) since I want to pass my authenticated session token into my Upload GET method…more on that next.
  • scope: for YouTube API calls we use http://gdata.youtube.com
  • secure: If you have registered your app with Google with the appropriate security credentials, you can set this to true to ensure that your API requests do not show the “Warning: Access Consent” verbiage after authenticating. Also, some API calls are not allowed unless your app is registered. For our testing, we send in false.
  • session: Whether the authenticated token should persist over multiple API calls or just be a “one-time-only” shot. This becomes very clear when we actually create our YouTubeRequest object.
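
To make those four parameters a bit more concrete, here’s roughly what the generated link looks like. This is a sketch in Python rather than the library’s actual code: the endpoint and the next/scope/secure/session query parameter names reflect my understanding of the AuthSub protocol, and authsub_request_url is a made-up helper name.

```python
from urllib.parse import urlencode

# Assumption: the AuthSub request endpoint as I understand the protocol.
AUTHSUB_ENDPOINT = "https://www.google.com/accounts/AuthSubRequest"


def authsub_request_url(continue_url, scope, secure, session):
    """Build an AuthSub login link from the four getRequestUrl inputs."""
    query = urlencode({
        "next": continue_url,            # where Google redirects afterwards
        "scope": scope,                  # which API the token is valid for
        "secure": 1 if secure else 0,    # registered/secure app or not
        "session": 1 if session else 0,  # exchangeable for a session token
    })
    return AUTHSUB_ENDPOINT + "?" + query


url = authsub_request_url(
    "http://localhost:50555/Home/Upload", "http://gdata.youtube.com",
    secure=False, session=True)
print(url)
```

The exact ordering and encoding the .NET library produces may differ; the point is only that the four arguments end up as query string values on a Google-hosted login URL.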

Next we add a view for our Login page. It’s going to be a very generic view, with only one link on the page. Here is the whole view:


<%@ Page Title="" Language="C#" MasterPageFile="~/Views/Shared/Site.Master" Inherits="System.Web.Mvc.ViewPage" %>
<asp:Content ID="Content1" ContentPlaceHolderID="TitleContent" runat="server">
    Login
</asp:Content>
<asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server">
    <h2>Login</h2>
    <a href="<%= Session["authSubUrl"] %>">Click here to login</a>
</asp:Content>

Notice how we are retrieving the URL text from Session["authSubUrl"], which we set in our Login() method. You could just as easily encapsulate this value into a ViewModel; however, I felt that for the type of exercise we’re performing here, this was sufficient.

Let’s compile our project now and run our website. What you see when you navigate to http://localhost:50555/Home/Login is similar to the following:

The link brings us to a very familiar page to anyone with a Google account:

Once the user has entered their credentials, the following screen shows up:

This is the warning I mentioned previously about a secure application. If you secure your site with Google, the verbiage here (according to the documentation) is omitted. I have not secured a site with Google yet, so I have not experienced this difference.

After clicking on “Allow Access”, we’re presented with the following screen:

D’oh! We don’t have an Upload view or controller method yet to handle this redirect! This is what we will create next. Take a look at the URL that Google navigated to post-login. http://localhost:50555/Home/Upload?token=CPvdxbuhGRDovLiXBw That looks awfully similar to what we specified in our Login() method, doesn’t it? And you can see the authenticated token in the QueryString at the end of our URL.
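
ASP.NET MVC will pull that token out of the query string for us, but to make explicit what is being extracted, here is the same step done by hand. It’s a Python sketch for illustration only; token_from_redirect is a hypothetical name.

```python
from urllib.parse import urlsplit, parse_qs


def token_from_redirect(url):
    """Return the `token` query parameter from the redirect URL, or None."""
    values = parse_qs(urlsplit(url).query).get("token")
    return values[0] if values else None


redirect = "http://localhost:50555/Home/Upload?token=CPvdxbuhGRDovLiXBw"
print(token_from_redirect(redirect))
```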

Next we add a controller method to handle GET requests to our Upload page. This is where we are going to handle binding our session token into a YouTubeRequestSettings object, and we’ll use that to build a YouTubeRequest object, which is how we’ll interact with the YouTube Data API. The method looks like this:


public ActionResult Upload(string token)
{
    Session["token"] = AuthSubUtil.exchangeForSessionToken(token, null);

    return View();
}

Ok, what we’re doing here is handling the QueryString token we get back from Google as a part of the GET request by making sure our method has a string parameter (which we call token). The method then calls exchangeForSessionToken on AuthSubUtil, which takes a string and an AsymmetricAlgorithm and returns a token good for an entire user session. This way we only have to authenticate the user once per session and they can make as many API calls as the system allows. Since we are not using a secured certificate for authentication, we pass null for the AsymmetricAlgorithm parameter. However, if you choose to use this functionality in a production environment I highly suggest taking a look at the documentation on registering your app with Google to take advantage of the heightened security. As this is a simple exercise, we are omitting this.

Next we add a strongly typed view (Create) for our Upload logic (UploadViewModel). We are going to choose the “Create” template from the dropdown, and our view comes out like so:


<%@ Page Title="" Language="C#" MasterPageFile="~/Views/Shared/Site.Master" Inherits="System.Web.Mvc.ViewPage<YouTubeUploader.Models.UploadViewModel>" %>
<asp:Content ID="Content1" ContentPlaceHolderID="TitleContent" runat="server">
    Upload
</asp:Content>
<asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server">
    <h2>Upload</h2>
    <%= Html.ValidationSummary("Create was unsuccessful. Please correct the errors and try again.") %>
    <% using (Html.BeginForm()) {%>
        <fieldset>
            <legend>Fields</legend>
            <p>
                <label for="Title">Title:</label>
                <%= Html.TextBox("Title") %>
                <%= Html.ValidationMessage("Title", "*") %>
            </p>
            <p>
                <label for="Keywords">Keywords:</label>
                <%= Html.TextBox("Keywords") %>
                <%= Html.ValidationMessage("Keywords", "*") %>
            </p>
            <p>
                <label for="Description">Description:</label>
                <%= Html.TextBox("Description") %>
                <%= Html.ValidationMessage("Description", "*") %>
            </p>
            <p>
                <label for="Private">Private:</label>
                <%= Html.TextBox("Private") %>
                <%= Html.ValidationMessage("Private", "*") %>
            </p>
            <p>
                <label for="VideoTags">VideoTags:</label>
                <%= Html.TextBox("VideoTags") %>
                <%= Html.ValidationMessage("VideoTags", "*") %>
            </p>
            <p>
                <label for="Latitude">Latitude:</label>
                <%= Html.TextBox("Latitude") %>
                <%= Html.ValidationMessage("Latitude", "*") %>
            </p>
            <p>
                <label for="Longitude">Longitude:</label>
                <%= Html.TextBox("Longitude") %>
                <%= Html.ValidationMessage("Longitude", "*") %>
            </p>
            <p>
                <label for="Path">Path:</label>
                <%= Html.TextBox("Path") %>
                <%= Html.ValidationMessage("Path", "*") %>
            </p>
            <p>
                <label for="Type">Type:</label>
                <%= Html.TextBox("Type") %>
                <%= Html.ValidationMessage("Type", "*") %>
            </p>
            <p>
                <input type="submit" value="Create" />
            </p>
        </fieldset>
    <% } %>
    <div>
        <%=Html.ActionLink("Back to List", "Index") %>
    </div>
</asp:Content>

This is a very limited view. What we are doing is adding a field for every property on our UploadViewModel. This allows the user to specify what kind of video they want to upload.

Next we add a controller method to handle the POST request for our Upload page (i.e. what happens when we click “Create”). This is where the bulk of our logic will reside. Here’s what the code looks like:


[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Upload(UploadViewModel uploadViewModel)
{
    const string developerKey = "THIS_IS_WHERE_YOUR_REALLY_LONG_DEVELOPER_API_KEY_GOES";
    const string applicationName = "THIS_IS_WHERE_YOUR_APP_NAME_GOES";

    _settings = new YouTubeRequestSettings(applicationName, "ThisCanSeriouslyBeAnyString_It'sBeenDeprecated", developerKey, (string) Session["token"]);
    _request = new YouTubeRequest(_settings);

    var newVideo = new Video();

    newVideo.Title = uploadViewModel.Title;
    newVideo.Keywords = uploadViewModel.Keywords;
    newVideo.Description = uploadViewModel.Description;
    newVideo.YouTubeEntry.Private = uploadViewModel.Private;

    newVideo.YouTubeEntry.Location = new GeoRssWhere(uploadViewModel.Latitude, uploadViewModel.Longitude);

    newVideo.Tags.Add(new MediaCategory(uploadViewModel.VideoTags, YouTubeNameTable.DeveloperTagSchema));

    newVideo.YouTubeEntry.MediaSource = new MediaFileSource(uploadViewModel.Path, uploadViewModel.Type);
    var createdVideo = _request.Upload(newVideo);

    return View();
}

Ok, so it was this method where I ran into the gotchas that prompted me to write this post in the first place. Once again, in a production environment, you will probably have the developerKey and applicationName stored in some kind of configuration file / object or a database. For our example, we’re just setting some hard-coded strings inside our method. These are used to create our YouTubeRequestSettings object. As you can see, the constructor takes 4 parameters, and this is the call that was a pain to debug. The 4 parameters are:

  • applicationName: The name of our application, as specified in our YouTube Account screen, to the left of our developer api key.
  • client: If you look on your YouTube account screen (as of February 24th, 2010) you’ll notice there is not a client id on your screen. In fact, there is verbiage stating that they are no longer required. Use any string you want here. Anything. I used “ThisIsMyRidiculouslyLongClientIdStringThatWillWorkJustBecause” and that is fine. It can be anything. I don’t know why this hasn’t been deprecated yet, but hopefully in the future it is, to reduce confusion.
  • developerKey: This is your developer key from your YouTube account page. It’s really long, so be sure when you copy / paste it in that you grabbed everything.
  • authSubToken: This is the string version of the AuthSub session token we created in our Upload() GET method and stored in Session["token"].

Once you understand the functionality in setting up your YouTubeRequestSettings object the rest is a walk in the park. The YouTubeRequest object itself takes a YouTubeRequestSettings object as a parameter, so you just new() up one of those with the YouTubeRequestSettings object we just created. Then, we create a new Video() object and set the properties on it equal to the values in our UploadViewModel. This is an ideal situation for AutoMapper in that all we’re doing is basically mapping properties from one object to another. However, for this example we are just going to set them explicitly ourselves. Then we create a new MediaFileSource object as a property on our Video object. Be sure to escape ‘\’ in your path, if you are using a local path (i.e. instead of C:\MyCode\Project you need C:\\MyCode\\Project). Also, for the Type property, you need the MIME type of the video you are uploading. For example, for Windows Media Video files, you want to use “video/x-ms-wmv” as your type.

And that’s it! Let’s run the web site now and see our results.

To make this more robust (and actually usable) you’ll want to provide some kind of feedback mechanism to notify your user whether the upload failed or was successful. For this example I chose to just prove how to upload the files.

I hope this eases someone’s pain and eliminates the 45 minutes – 1 hour I lost trying to figure out why my API calls weren’t being correctly authenticated. Take some time and experiment with the rest of the APIs, which allow you to do pretty much anything you can do on the web site.