running a WebDAV GET request against YQL
This builds off my previous post. Suppose you’ve got content in YQL that you’d like to GET (ha!) out. The table is super simple. Ok, this is really just an unexciting GET request to YQL, but it’s cool because we’re starting to think of YQL as a file store accessible via WebDAV methods.
Prerequisites
- A sherpa record w/ this value in it: {"file1":"content 1", "file2":"content 2"}. The keys and values w/in the JSON can be anything. If you haven’t worked w/ YQL storage before, check out the documentation.
- The code below edited to use your storage record’s select address
Flow
- You make a GET request to YQL w/ a query param path set to the value of one of your keys in the JSON object described above, e.g. path='file1'
- YQL retrieves the storage record, converts it to JSON, and returns the value associated w/ the path you sent
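The lookup in this flow can be sketched outside YQL in a few lines of Python. This is only an illustration of the idea, not the table itself: the function name lookup and the sample record are assumptions that mirror the prerequisite record above.

```python
import json


def lookup(record_json, path):
    """Return the value stored under `path` in a JSON storage record."""
    files = json.loads(record_json)  # the record is one flat JSON object
    return files[path]               # raises KeyError for an unknown path


# Sample record matching the one described in the prerequisites.
record = '{"file1": "content 1", "file2": "content 2"}'
print(lookup(record, "file1"))  # prints: content 1
```

An unknown path raises KeyError here; a real WebDAV front end would presumably translate that into a 404.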
Code
<?xml version="1.0" encoding="UTF-8"?>
<table xmlns="http://query.yahooapis.com/v1/schema/table.xsd">
<meta>
<author>Erik Eldridge</author>
<description>
</description>
<sampleQuery></sampleQuery>
</meta>
<bindings>
<select produces="XML">
<inputs>
<key id="path" type="xs:string" paramType="variable"/>
</inputs>
<execute><![CDATA[
response.object = function () {
//fetch 'files'
var query = 'select * from yql.storage where name="store://fOUBAHrNTP9vFVB2k8E2jEE"',
results = y.xmlToJson(y.query(query).results);
return results.results.result.value[path];
}();
]]></execute>
</select>
</bindings>
</table>
Notes
- If you wanted to use a WebDAV client w/ this output, you could run something like this Ruby code in a Rack app, and point your WebDAV client at it:
require 'net/http'
require 'rexml/document'
require 'rack'

file = 'file1'
query = "use 'http://example.com/get.xml' as table; select * from table where path='#{file}'"
host = 'http://query.yahooapis.com'
path = '/v1/public/yql'
q = Rack::Utils.escape(query)
# setting debug to true turns off YQL's caching, which is good when testing
uri = "#{host}#{path}?q=#{q}&debug=true"
res = Net::HTTP.get_response( URI.parse(uri) )
doc = REXML::Document.new(res.body)
# extract the 'results' element
result = REXML::XPath.first( doc, "//results" )
# return the flattened xml (the body is wrapped in an array because a Rack body must respond to each)
[200, {"Content-Type" => "application/xml"}, ['<?xml version="1.0" encoding="utf-8"?>' + result.elements[1].to_s]]
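For comparison, here is a rough Python sketch of the same two steps: building the YQL request URL and pulling the first child out of the results element. The helper names and the SAMPLE response body are assumptions for illustration; the real YQL response carries extra attributes and namespaces that this sketch ignores, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(table_url, file_name):
    """Build the GET url for the open table, with debug=true to skip caching."""
    query = "use '%s' as table; select * from table where path='%s'" % (
        table_url, file_name)
    return YQL_ENDPOINT + "?" + urlencode({"q": query, "debug": "true"})


def first_result(response_body):
    """Return the first child of <results> as an XML string, or None."""
    root = ET.fromstring(response_body)
    results = root.find(".//results")
    if results is None or len(results) == 0:
        return None
    return ET.tostring(results[0], encoding="unicode")


# Hypothetical, simplified response body used only to exercise the parser.
SAMPLE = "<query><results><result>content 1</result></results></query>"

print(build_yql_url("http://example.com/get.xml", "file1"))
print(first_result(SAMPLE))
```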
generating webdav propfind xml from yql
E4X support makes YQL a great XML-generation engine. Here’s some code to create the response xml for a WebDAV PROPFIND request for a directory called webdav containing an empty file called foo.txt.
Note: to initially get a handle on what XML WebDAV outputs, I turned on WebDAV support in apache and made a curl request to it like this:
curl -X PROPFIND --header "Depth: 1" {user}:{pass}@{your ip address}/webdav/
You can run the code below in the YQL console.
<?xml version="1.0" encoding="UTF-8"?>
<table xmlns="http://query.yahooapis.com/v1/schema/table.xsd">
<meta>
<author>Erik Eldridge</author>
<description>
</description>
<sampleQuery></sampleQuery>
</meta>
<bindings>
<select produces="XML">
<inputs>
<key id="method" type="xs:string" paramType="variable"/>
<key id="path" type="xs:string" paramType="variable"/>
</inputs>
<execute><![CDATA[
response.object = function () {
var xml = <D:multistatus xmlns:D="DAV:">
<D:response xmlns:lp1="DAV:" xmlns:lp2="http://apache.org/dav/props/">
<D:href>/webdav/</D:href>
<D:propstat>
<D:prop>
<lp1:resourcetype>
<D:collection/>
</lp1:resourcetype>
<lp1:creationdate>2010-01-02T19:43:01Z</lp1:creationdate>
<lp1:getlastmodified>Sat, 02 Jan 2010 19:43:01 GMT</lp1:getlastmodified>
<lp1:getetag>"19013d-1000-b2283b40"</lp1:getetag>
<D:supportedlock>
<D:lockentry>
<D:lockscope>
<D:exclusive/>
</D:lockscope>
<D:locktype>
<D:write/>
</D:locktype>
</D:lockentry>
<D:lockentry>
<D:lockscope>
<D:shared/>
</D:lockscope>
<D:locktype>
<D:write/>
</D:locktype>
</D:lockentry>
</D:supportedlock>
<D:lockdiscovery/>
<D:getcontenttype>httpd/unix-directory</D:getcontenttype>
</D:prop>
<D:status>HTTP/1.1 200 OK</D:status>
</D:propstat>
</D:response>
<D:response xmlns:lp1="DAV:" xmlns:lp2="http://apache.org/dav/props/">
<D:href>/webdav/foo.txt</D:href>
<D:propstat>
<D:prop>
<lp1:resourcetype/>
<lp1:creationdate>2010-01-02T19:43:01Z</lp1:creationdate>
<lp1:getcontentlength>0</lp1:getcontentlength>
<lp1:getlastmodified>Sat, 02 Jan 2010 19:43:01 GMT</lp1:getlastmodified>
<lp1:getetag>"19013f-0-b2283b40"</lp1:getetag>
<lp2:executable>F</lp2:executable>
<D:supportedlock>
<D:lockentry>
<D:lockscope>
<D:exclusive/>
</D:lockscope>
<D:locktype>
<D:write/>
</D:locktype>
</D:lockentry>
<D:lockentry>
<D:lockscope>
<D:shared/>
</D:lockscope>
<D:locktype>
<D:write/>
</D:locktype>
</D:lockentry>
</D:supportedlock>
<D:lockdiscovery/>
<D:getcontenttype>text/plain</D:getcontenttype>
</D:prop>
<D:status>HTTP/1.1 200 OK</D:status>
</D:propstat>
</D:response>
</D:multistatus>;
return xml;
}();
]]></execute>
</select>
</bindings>
</table>
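The table above hard-codes one directory and one file. As a hedged sketch of how the same multistatus document could be generated from a list of entries, here is a Python version using the standard library's ElementTree. It emits only a small subset of the properties shown above (resourcetype, getcontenttype, status), and the function names are made up for this example.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
ET.register_namespace("D", DAV)  # serialize the DAV: namespace with a D: prefix


def dav(tag):
    """Qualify a tag name with the DAV: namespace."""
    return "{%s}%s" % (DAV, tag)


def add_response(parent, href, is_collection, content_type):
    """Append one <D:response> describing a directory or a file."""
    response = ET.SubElement(parent, dav("response"))
    ET.SubElement(response, dav("href")).text = href
    propstat = ET.SubElement(response, dav("propstat"))
    prop = ET.SubElement(propstat, dav("prop"))
    resourcetype = ET.SubElement(prop, dav("resourcetype"))
    if is_collection:
        # directories are marked by a <D:collection/> child
        ET.SubElement(resourcetype, dav("collection"))
    ET.SubElement(prop, dav("getcontenttype")).text = content_type
    ET.SubElement(propstat, dav("status")).text = "HTTP/1.1 200 OK"
    return response


def multistatus(entries):
    """Build a PROPFIND-style multistatus document from (href, is_dir, type) tuples."""
    root = ET.Element(dav("multistatus"))
    for href, is_collection, content_type in entries:
        add_response(root, href, is_collection, content_type)
    return ET.tostring(root, encoding="unicode")


xml_text = multistatus([
    ("/webdav/", True, "httpd/unix-directory"),
    ("/webdav/foo.txt", False, "text/plain"),
])
print(xml_text)
```

A real client may expect more properties (dates, etags, lock info) than this subset provides, so treat it as a starting point rather than a complete response.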
Building and running the Mochiweb dev server
the Internet as a series of tubes
We’ve long sought to create a singular artificial intelligence. I wonder if another aspect of intelligence arises simply from the existence of connections. Our brain is not composed of intelligence, but rather a mass of connections, neural pathways, which somehow creates an opportunity for intelligence to occur. The Internet currently seems to be beginning to manifest intelligent behavior in that I can interact with it and gain something from that interaction (“You just checked into restaurant X. Your friend Alan is here too”). I wonder if it is our input then that brings this intelligence to life, i.e., we are the intelligence in the giant networked computer “brain”. I can query the Web to find out where my friends are only because they have stated their position. A measure of the Net’s intelligence could be based on the data it contains, the questions we imagine to ask, and our ability to ask them.
I feel inspired as an Internet software developer to make the process of interaction, contribution, and connection as easy as possible. How can I make it easier to contribute? Simplified markup is one option. Easy authentication is another. Improved data collection, such as automated geo positioning via mobile devices, and mining enable us to contribute implicitly. How can I reduce the barrier to entry?
Yahoo!’s recent brand campaign stated that the Internet is all about “you”. One way to interpret this is that Yahoo! facilitates contributions to, and recognition of, one’s self online. I spend so much of my time online that it seems like a second home. So much of my persona involves how I see myself reflected in the Internet. Yahoo!, and many other services, try to make it easy to be online and manifest a personality there. This is one way to describe the process of growing the Web. This propagation of the Web could be summarized as: building physical and logical connections between people, and allowing them to input and retrieve data. I’m curious to see if the Web does seem to become more intelligent relative to the success of these activities.
A few current touch points involving the simplification of Internet interaction, i.e., interaction with the Net itself, come to mind: establishing network infrastructure (how easy can we make it to set up an internet access point? this shouldn’t be a bottleneck); creating and maintaining online identities (oauth, openid); storing the collected data in easily retrievable formats (semantic web, search, open gov, freebase, wikipedia); processing big data using mapreduce; server-side processing with web hooks and app engines; providing easy access to the processed data via asynchronous communication, key/value interfaces, convenient off-network “connect” access to social data; using apps on existing networks, and mobile devices, for easy delivery and consumption, esp. iphone and android.
standard stack v1: git
preamble
we’ll use git to facilitate the process of pushing code to the vm. because there’s a cardinal rule about not serving files from a repo, we’ll need to create a git host and use a git hook to update the web root when code is pushed to the repo. i’m using the terms hub and prime introduced by Joe Maller in his post A web-focused Git workflow.
i don’t have a cool picture of the concept, like Maller did, but here’s one of a cute red panda (credit: tambako) to set the mood before we get started:

ok, here we go:
terms
- prime is the copy of the repo accessible by the web server
- hub is the bare source of truth repository
- project refers to the prime/hub pair
- vm is the vmware vm running centos
- laptop is the development computer you ultimately want to push files from
environment
- mac os x 10.5.8
- vmware 2.0.5
- centos 5.4
- git 1.5.5.6
steps
- set up
  - on the vm, install git as root:
    yum install git
  - on the vm, create a user to handle git-related activity:
    useradd git
  - on the vm, get its inet ip address using ifconfig:
    ifconfig
  - on the vm, copy your rsa public key (you’ll be pushing git updates over ssh) from your laptop into the git user’s .ssh/authorized_keys file
  - on the vm, make sure the correct permissions are set on the authorized_keys file and .ssh dir:
    chmod 700 /home/git/.ssh; chmod 644 /home/git/.ssh/authorized_keys
  - on your laptop, run a sanity check by logging into the vm via public key. note: if you’re using an alternate ssh port and/or different pub key file name, define these in your laptop’s .ssh/config file:
    ssh git@{ip address}
  - on the vm, in /var/www/, as root, create a directory that git can push content to (note: if the dir isn’t owned by git or isn’t world-writable, git throws an “error: cannot open .git/FETCH_HEAD: Permission denied” error):
    mkdir /var/www/git/; chown git:git /var/www/git/
  - on the vm, cd into the /var/www/git/ directory and su to the git user:
    cd /var/www/git/; su git
- create a new project
  - on the vm, create a new directory {proj name} for the prime repo and cd into it:
    mkdir proj; cd proj
  - on the vm, initialize a git repo:
    git init
  - on the vm, create and add a file so we can clone prime later (git disallows cloning an empty repo):
    touch readme;
    git add readme;
    git commit -m 'initial commit'
    note: if you haven’t already told git who you are, run:
    git config user.email "example.com@domain.com"
    git config user.name "example.com"
  - on the vm, define a remote repository for the soon-to-be-created hub:
    git remote add origin /home/git/proj
  - on the vm, cd into git user’s home directory:
    cd ~
  - on the vm, create the hub repo by cloning the newly created repo using the --bare flag (that’s a double ‘-’ before bare):
    git clone --bare /var/www/git/proj
  - on the vm, create a post-update hook in the hub repo to update the web directory when an update is pushed. open /home/git/proj/hooks/post-update and add the following:
    # jump into web dir
    cd /var/www/sites/example.com/
    # w/o this, git throws "fatal: Not a git repository: '.'" error
    # ref: http://bit.ly/5lieqQ
    unset GIT_DIR
    # pull in the updates
    git pull origin master
- start working
  - on the laptop, open a terminal on whatever machine you’re going to develop on and clone the new host repo:
    git clone git@{ip address}:proj
  - on the laptop, edit the readme file in the repo, check in the change, push it, and observe in the output the results of the hook-initiated pull
  - on the laptop, view http://{ip address}/readme to confirm the new code is displaying
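One detail the steps leave implicit is that git only runs a hook file that is executable. As a small sketch of installing the post-update hook with the right mode, here is a Python helper. The function name, the added #!/bin/sh line, and the use of /var/www/git/proj as the prime path are assumptions for illustration; it writes into a temporary directory so it can be run safely anywhere.

```python
import os
import shlex
import stat
import tempfile

# Hook body modeled on the one in the steps above; the shebang line is an addition.
HOOK_TEMPLATE = """#!/bin/sh
# jump into the web dir (the prime repo)
cd {prime_dir} || exit 1
# w/o this, git throws "fatal: Not a git repository: '.'"
unset GIT_DIR
# pull in the updates
git pull origin master
"""


def write_post_update_hook(hub_dir, prime_dir):
    """Write hooks/post-update under hub_dir and mark it executable."""
    hooks_dir = os.path.join(hub_dir, "hooks")
    os.makedirs(hooks_dir, exist_ok=True)
    hook_path = os.path.join(hooks_dir, "post-update")
    with open(hook_path, "w") as f:
        f.write(HOOK_TEMPLATE.format(prime_dir=shlex.quote(prime_dir)))
    # add the execute bits; without them git silently skips the hook
    mode = os.stat(hook_path).st_mode
    os.chmod(hook_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook_path


# Demo against a throwaway directory standing in for the hub repo.
demo_hub = tempfile.mkdtemp()
demo_hook = write_post_update_hook(demo_hub, "/var/www/git/proj")
print(demo_hook)
```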
references
- A web-focused Git workflow
- Linus Torvalds’ suggestion to use .ssh/config to define alternate port
- James Strachan’s helpful clarification on why we run unset GIT_DIR before git pull
setting up nginx and mochiweb on centos 5
- Install nginx on centos using cyberciti’s tutorial
- update default iptables to allow http traffic:
# ref: http://www.cyberciti.biz/faq/redhat-fedora-ip6tables-firewall-configuration/
# ref: http://wiki.zimbra.com/index.php?title=Firewall_Configuration
# Firewall configuration written by system-config-securitylevel
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p 50 -j ACCEPT
-A RH-Firewall-1-INPUT -p 51 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m tcp -p tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
- install mochiweb using BeeBole’s tutorial. For ease of use while testing, launch the dev server in a separate screen, as the mochiweb shell will own the terminal used to launch it by default, and add the following line to iptables so we can hit the server directly:
-A RH-Firewall-1-INPUT -m tcp -p tcp --dport 8000 -j ACCEPT # allow access to mochiweb
Test that mochiweb is available to localhost by running the following from the command line on the server:
curl http://127.0.0.1:8000
You should get something back like:
<html>
<head>
<title>It Worked</title>
</head>
<body>
MochiWeb running.
</body>
</html>
- Configure nginx to proxy api calls to mochiweb. Put this in /etc/nginx/nginx.conf:
user nginx;
worker_processes 1;
error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;
events {
    worker_connections 1024;
}
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    log_format main '$remote_addr - $remote_user [$time_local] $request '
                    '"$status" $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    sendfile on;
    keepalive_timeout 65;
    include /etc/nginx/conf.d/*.conf;
    server {
        listen 80;
        server_name localhost;
        location ~ api { # <-- pass requests for 'api...' to mochiweb
            proxy_pass http://127.0.0.1:8000;
        }
        location / {
            root /usr/share/nginx/html;
            index index.html index.htm;
        }
        error_page 404 /404.html;
        location = /404.html {
            root /usr/share/nginx/html;
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /usr/share/nginx/html;
        }
    }
}
As per BeeBole’s tutorial, edit the mochiweb request handler to handle requests for ‘api’:
%% @author author <author@example.com>
%% @copyright YYYY author.
%% @doc Web server for myapp.

-module(myapp_web).
-author('author <author@example.com>').
-export([start/1, stop/0, loop/2]).

%% External API

start(Options) ->
    {DocRoot, Options1} = get_option(docroot, Options),
    Loop = fun (Req) ->
                   ?MODULE:loop(Req, DocRoot)
           end,
    mochiweb_http:start([{name, ?MODULE}, {loop, Loop} | Options1]).

stop() ->
    mochiweb_http:stop(?MODULE).

loop(Req, DocRoot) ->
    "/" ++ Path = Req:get(path),
    case Req:get(method) of
        Method when Method =:= 'GET'; Method =:= 'HEAD' ->
            case Path of
                "api" ->
                    Req:ok({"text/html", [], ["<h1>Congratulation</h1>"]}); % <-- the 'api' request handler
                _ ->
                    Req:serve_file(Path, DocRoot)
            end;
        'POST' ->
            case Path of
                _ ->
                    Req:not_found()
            end;
        _ ->
            Req:respond({501, [], []})
    end.

%% Internal API

get_option(Option, Options) ->
    {proplists:get_value(Option, Options), proplists:delete(Option, Options)}.
As per James Gardner’s post Streaming File Upload with Erlang and Mochiweb Multipart Post, rebuild the request handler by running make in the myapp directory. The mochiweb server will automatically restart
- confirm the proxy is working by hitting http://domain/ and http://domain/api. The former should return the nginx install confirmation page, and the latter should return the simple “Congratulation” page.
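One thing worth knowing about the config above: location ~ api is an unanchored, case-sensitive regex, so it matches any URI that contains "api", not just /api. The following Python sketch models a simplified version of nginx's location selection for this one server block (exact matches first, then regex, then the / prefix). The route function and the table names are invented for the illustration, and nginx's real algorithm has more cases than this.

```python
import re

# Simplified model of the server block above (not nginx's full algorithm).
EXACT = {"/404.html": "static", "/50x.html": "static"}  # location = ...
REGEX = [(re.compile(r"api"), "mochiweb")]              # location ~ api


def route(uri):
    """Return which backend would handle `uri` under this simplified model."""
    if uri in EXACT:
        return EXACT[uri]
    for pattern, upstream in REGEX:
        if pattern.search(uri):  # unanchored: matches anywhere in the uri
            return upstream
    return "static"              # falls through to location /


for uri in ["/", "/api", "/api/users", "/capital.html"]:
    print(uri, "->", route(uri))
```

In this model /capital.html is also sent to mochiweb because "capital" contains "api". If that matters, anchoring the pattern (for example ~ ^/api) or using a plain prefix location would be one way to narrow it. Note also that the Erlang handler above only answers the exact path "api"; anything else that is proxied falls through to serve_file.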
steps for merging changes from a remote clone of a git repo
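I’m a fan of github, but I don’t know how to apply changes made to a clone of my repo, usually announced via a pull request. The goal of this post, then, is to define these steps. Note: the steps below pulled in the changes as desired, but also auto-committed them despite the --no-commit flag, so these steps need refinement. A likely cause is that the pull was a fast-forward: a fast-forward creates no merge commit, so --no-commit has nothing to hold back, and combining it with --no-ff should avoid this.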
prereq
- a git repo with a remote named origin
- committer has issued a pull request. For this example, I’ll use a committer named FooBaz
steps
- add committer’s repo as a remote
  - copy clone url for pull requester’s repo, e.g. git://github.com/FooBaz/yql-tables.git
  - define remote repo:
    git remote add FooBaz git://github.com/FooBaz/yql-tables.git
  - view list of remotes as sanity check:
    git remote show
- pull in FooBaz’s changes:
  - run:
    git pull --no-commit FooBaz master
  - note: this actually committed the changes for me
- push changes to origin repo:
  git push origin master
ref
Dav Glass’ YQL module for YUI 3 is awesome
sample app:
<ul>
<li><img/></li>
</ul>
<script type="text/javascript" src="http://yui.yahooapis.com/3.0.0/build/yui/yui-min.js"></script>
<script type="text/javascript" src="http://github.com/davglass/yui-yql/raw/master/yql-min.js"></script>
<script>
//ref: http://davglass.github.com/yui-yql/
YUI().use('yql', 'node', function(Y) {
var q1 = new Y.yql('select source from flickr.photos.sizes where photo_id in (select id from flickr.photos.search where text="panda" and safe_search="true")');
q1.on('query', function(r) {
var li = Y.get('li');
for (var i = 0; i < r.results.size.length; i++) {
if (-1 !== r.results.size[i].source.indexOf('_s')) {
var clone = li.cloneNode(true);
clone.query('img').set('src', r.results.size[i].source);
Y.get('ul').append(clone);
}
}
});
});
</script>
centos 5 yum update error & resolution
I just tried to update my centos 5 install via yum and got the following error messages:
filelists.sqlite.bz2 | 1.5 MB 00:01
http://centos.eecs.wsu.edu/5.4/updates/i386/repodata/filelists.sqlite.bz2: [Errno -1] Metadata file does not match checksum
Trying other mirror.
filelists.sqlite.bz2 | 1.1 MB 00:00
http://mirror.facebook.net/centos/5.4/updates/i386/repodata/filelists.sqlite.bz2: [Errno -1] Metadata file does not match checksum
Trying other mirror.
....
I searched online for “yum update Metadata file does not match checksum” and found a helpful blog post. Following the post’s suggestion, I ran yum clean all, which seems to have fixed the problem.
Simpleton Pattern
Problem
We don’t want multiple objects of the same class, but we also don’t want to clutter our code with the kind of checks required to implement the Singleton pattern
Solution
Kill program execution if a second attempt to instantiate an object occurs
Example
class Foo {
function constructor(){
if(objectOfClassAlreadyExists('Foo')){
stopExecution();
}
...
}
}
...
variable foo = new Foo //all good
...
variable bar = new Foo //poof!
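The pseudocode above translates almost directly into Python. This is just one possible rendering of the joke: the class-level flag _exists stands in for objectOfClassAlreadyExists, and sys.exit stands in for stopExecution.

```python
import sys


class Foo:
    _exists = False  # class-level flag: has a Foo been created yet?

    def __init__(self):
        if Foo._exists:
            # second instantiation: kill the program instead of handling it
            sys.exit("Foo already exists")
        Foo._exists = True


foo = Foo()    # all good
# bar = Foo()  # poof! (uncommenting this line terminates the program)
```

sys.exit works by raising SystemExit, so a caller could still catch it, which arguably makes this version slightly less committed to the pattern than the original.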




