Unmanned Spaceflight.com _ Chit Chat _ New Google Coop Search

Posted by: helvick Oct 24 2006, 09:18 PM

Google have recently made significant improvements to their Google Co-Op tagged/focussed search capability. I gave it a twirl to see what it could do and created a UMSF-themed (but not restricted) Co-Op search page by pulling out a small subset of my bookmarks:

http://google.com/coop/cse?cx=013223497075414387822%3Ai1hnrhwchn4

At the moment the basic search includes the following:
www.unmannedspaceflight.com
ntrs.nasa.gov
orca.phys.uvic.ca
www.lpl.arizona.edu
pubs.giss.nasa.gov
athena1.cornell.edu
www.spacedaily.com
www.badastronomy.com
www.nasa.gov
www.isas.ac.jp
www.federalspace.ru
marsrovers.jpl.nasa.gov
pluto.jhuapl.edu
messenger.jhuapl.edu
saturn.jpl.nasa.gov
www.esa.int
marsprogram.jpl.nasa.gov
hiroc.lpl.arizona.edu
www.astro.cornell.edu
www.planetary.org

I've also added two "refinements". The first allows for drilling in to unmannedspaceflight.com only (this is the same as a Google search with site:unmannedspaceflight.com added, but it might be useful alongside a "UMSF and Related sites" option). The second is an attempt at an unmannedspaceflight.com images search, which isn't working properly yet.

Any suggestions on sites to add? remove?

It seems to work quite well for me but I won't really be able to tell until I'm actually looking for something that I'm finding hard to locate.

Posted by: babakm Oct 24 2006, 09:50 PM

Great Idea! Here are a couple more sites:

www.msss.com
*.jpl.nasa.gov
www.jaxa.jp

Posted by: helvick Oct 24 2006, 09:58 PM

Thanks. For some reason Google have disabled the ability to add wildcards within domain names (so *.jpl.nasa.gov can't be added).

It is possible to add a long list of sites (up to 2000) as a single large XML file, so if anyone knows an easy way to get a list of all of the useful nasa.gov sub-domain names then I can hack that together fairly quickly.
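
As a rough illustration of what "hacking together" such a file could look like, here is a minimal Python sketch that turns a list of hostnames into an XML document with one entry per site. The element names (Annotations, Annotation, Label) and the label value `_cse_example` are assumptions about the annotation format, not something confirmed in this thread, so check them against Google's own documentation before uploading anything.

```python
import xml.etree.ElementTree as ET

def build_annotations(hosts, label="_cse_example"):
    """Return an XML string with one annotation per hostname.

    The element and attribute names are assumed, and the label
    is a hypothetical placeholder for the search engine's own label.
    """
    root = ET.Element("Annotations")
    for host in hosts:
        # "host/*" is meant to cover every page on that exact host
        annotation = ET.SubElement(root, "Annotation", about=f"{host}/*")
        ET.SubElement(annotation, "Label", name=label)
    return ET.tostring(root, encoding="unicode")

hosts = ["www.jpl.nasa.gov", "marsrovers.jpl.nasa.gov", "saturn.jpl.nasa.gov"]
xml_text = build_annotations(hosts)
print(xml_text)
```

Each sub-domain gets its own entry because wildcards are not accepted in the domain part.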

Posted by: Stephen Oct 25 2006, 08:21 AM

QUOTE (helvick @ Oct 25 2006, 07:58 AM) *
For some reason Google have disabled the ability to add wildcards within domain names (so *.jpl.nasa.gov can't be added).

Doesn't Google's "site" command work?

I find (for ordinary Google searches):
site:jpl.nasa.gov == *.jpl.nasa.gov
That is, "site:jpl.nasa.gov" will confine a search to sites with hostnames which end in "jpl.nasa.gov", such as "http://www.jpl.nasa.gov", "http://marsrovers.jpl.nasa.gov", "http://saturn.jpl.nasa.gov", etc.
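
The "hostnames which end in" behaviour can be sketched in a few lines of Python. This only illustrates the idea described above and is not Google's actual matching logic; the function name `matches_site` is made up for the example, and matching on a dot boundary (so that "notjpl.nasa.gov" is excluded) is an assumption.

```python
def matches_site(hostname, site):
    """True if hostname equals site or is a sub-domain of it."""
    hostname = hostname.lower().rstrip(".")
    site = site.lower().rstrip(".")
    # Require a dot before the suffix so "notjpl.nasa.gov" is not a match
    return hostname == site or hostname.endswith("." + site)

for host in ["www.jpl.nasa.gov", "marsrovers.jpl.nasa.gov", "www.nasa.gov"]:
    print(host, matches_site(host, "jpl.nasa.gov"))
```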

======
Stephen

Posted by: helvick Oct 25 2006, 09:43 AM

The site: filter still works as it always did within the standard Google search box. It can be used as part of a refinement and it can also be added by the user to a custom search query if they choose. However, I was referring to the Customised Search interface, which says that you have to be precise about domain names. The help file states that:

QUOTE
Entire sites: You can specify an entire site to be included or excluded. For example, specifying www.mysite.com/* will include or exclude all the pages on www.mysite.com. Note that mysite.com does NOT match subsite.mysite.com, so be specific.

You can use wildcards in the non-domain portion of the URL, so www.y.com/*kites will match any page on www.y.com whose name includes the word kites.
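
To make the difference from site: concrete, here is a small Python sketch that uses the standard fnmatch module as a stand-in for the pattern rules quoted above. It is not Google's matcher, and `pattern_matches` is a hypothetical helper. Note that under fnmatch rules a pattern needs a trailing * to match names that merely include a word; whether the Co-Op interface implies one is not something this sketch settles.

```python
from fnmatch import fnmatchcase

def strip_scheme(url):
    """Drop a leading 'http://' or similar so only host/path remains."""
    return url.split("://", 1)[-1]

def pattern_matches(pattern, url):
    """Shell-style match of a host/path pattern against a URL."""
    return fnmatchcase(strip_scheme(url).lower(), pattern.lower())

checks = [
    ("www.mysite.com/*", "http://www.mysite.com/page.html"),
    ("mysite.com/*", "http://subsite.mysite.com/page.html"),
    ("www.y.com/*kites*", "http://www.y.com/toys/kites.html"),
]
for pattern, url in checks:
    print(pattern, url, pattern_matches(pattern, url))
```

The second check fails because the host part has to match exactly, which is why each sub-domain has to be listed separately.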

Posted by: Stephen Oct 25 2006, 11:04 AM

QUOTE (helvick @ Oct 25 2006, 07:43 PM) *
The site: filter still works as it always did within the standard Google search box. It can be used as part of a refinement and it can also be added by the user to a custom search query if they choose. However, I was referring to the Customised Search interface, which says that you have to be precise about domain names. The help file states that: "Entire sites: You can specify an entire site to be included or excluded. For example, specifying www.mysite.com/* will include or exclude all the pages on www.mysite.com. Note that mysite.com does NOT match subsite.mysite.com, so be specific."

That's going to be a real pain then, because there are a lot of websites in nasa.gov alone.

======
Stephen

EDIT: Make that dozens! I did a quick google using "site:nasa.gov" and came up with page upon page of them. Here's a small fraction (not all to do with UMSF, of course):

www.nas.nasa.gov
www.jpl.nasa.gov
gcmd.nasa.gov
ldcm.nasa.gov
www.wff.nasa.gov
solarsystem.nasa.gov
newfrontiers.nasa.gov
www.giss.nasa.gov
www.universe.nasa.gov
sunearthday.nasa.gov
hubble.nasa.gov
history.nasa.gov
softwarereuse.nasa.gov
www.sewp.nasa.gov
www.visibleearth.nasa.gov
nccs.nasa.gov
www.wstf.nasa.gov
aqua.nasa.gov
esto.nasa.gov
www.nsbf.nasa.gov

Posted by: djellison Oct 25 2006, 11:34 AM

I would add.....

http://photojournal.jpl.nasa.gov
and
http://pds.jpl.nasa.gov/

Posted by: helvick Oct 25 2006, 02:42 PM

I've added those, and the XML annotation upload function works pretty well, so the exercise is pretty painless provided we have a list of URLs.

Also, someone volunteered to help via the collaboration page, but the collaboration function doesn't actually tell me who it is, so I'm a bit wary of just accepting it. If it's someone here then let me know via PM and I'll clear it, and if anyone else wants to join in, do the same.

Now I honestly don't know if anyone else will find it worthwhile but I'm finding stuff that I find interesting e.g.
http://ntrs.nasa.gov/search.jsp?R=909465&id=4&qs=Ne%3D25%26N%3D4294967255%2B131

Posted by: ElkGroveDan Oct 25 2006, 03:20 PM

I had no idea that Google had this capability. My mind is spinning over the possibilities. Thanks for demonstrating this.

Posted by: elakdawalla Oct 25 2006, 03:54 PM

Here's one I stumbled across a while ago, the JPL technical reports server!
http://trs-new.jpl.nasa.gov/dspace/

There are lots of miscellaneous European universities and institutes hosting instrument-specific websites. Do you want to add those?

--Emily

Posted by: helvick Oct 27 2006, 04:17 PM

For those with either Firefox 2.0 or IE7 I've hacked together a custom Search Plugin that you can install by going to http://helvick.googlepages.com/Helvicksfae.html

If you have either of those browsers then you should see that the search plugin is auto discovered when you navigate to that page.

This should work on any other browser that supports Amazon's OpenSearch standard, but I've only tested it on FF2/IE7.
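
For anyone curious what sits behind such a plugin, an OpenSearch description is a small XML document naming the engine and giving a URL template with a {searchTerms} placeholder. The Python sketch below builds a minimal one. The short name, description and template URL are placeholders (www.example.com is not the real search address), and this is not the actual file served from the page above.

```python
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"

def build_description(short_name, description, template):
    """Return a minimal OpenSearch description document as a string."""
    # Serialise with the OpenSearch namespace as the default namespace
    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # The browser substitutes the typed query for {searchTerms}
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {"type": "text/html", "template": template},
    )
    return ET.tostring(root, encoding="unicode")

doc = build_description(
    "UMSF search",                                    # placeholder name
    "Search UMSF and related sites",                  # placeholder text
    "http://www.example.com/search?q={searchTerms}",  # placeholder URL
)
print(doc)
```

Auto-discovery is normally done by linking to the description file from the page head with a link element whose rel is "search" and whose type is "application/opensearchdescription+xml".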

Emily - I have ntrs.nasa.gov included already and that seems to get to a technical report server at least. The url you provided always errors out when I try it. I am outside of the US so that may be a factor.

Posted by: jaredGalen Oct 27 2006, 05:12 PM

Helvick, speaking of search engines, have you seen http://www.yubnub.org?

It can actually be very powerful: you can define custom searches and commands, so the potential is there to define custom search tags that direct searches through the Co-Op search you have set up.

I use it to give me access to multiple search engines from a single search bar, e.g. "y cassini" uses Yahoo to search, "g cassini" uses Google, etc.

Just said I'd mention it.

Posted by: helvick Oct 27 2006, 07:07 PM

Jared,

That's damn interesting - thanks. I'll have to dig into it to see how to make it do what I want, but it really looks like a very clever approach to the web. I already use Firefox keyword shortcuts, so that entering "g search term" in the address bar searches Google for my search term, "imdb search term" does the same for IMDb, and "u" opens up UMSF on the "View new posts" page, which makes getting my fix really easy. This is a way to offload these shortcuts to the web so that they work on any browser anywhere, not just on my instance of FF.
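
The keyword-shortcut idea boils down to a lookup table from a short keyword to a URL template. Here is a minimal Python sketch of that mechanism. The templates and the `expand` function are illustrative assumptions rather than the real Firefox or YubNub configuration, and the "u" entry points at the site root because the exact "View new posts" address isn't given here.

```python
from urllib.parse import quote_plus

# Hypothetical keyword table; {terms} marks where the query goes
SHORTCUTS = {
    "g": "https://www.google.com/search?q={terms}",
    "imdb": "https://www.imdb.com/find?q={terms}",
    "u": "http://www.unmannedspaceflight.com/",
}

def expand(entry):
    """Turn 'keyword search terms' into a URL, or None if unknown."""
    keyword, _, terms = entry.strip().partition(" ")
    template = SHORTCUTS.get(keyword)
    if template is None:
        return None
    # URL-encode the terms before substituting them into the template
    return template.replace("{terms}", quote_plus(terms.strip()))

print(expand("g cassini huygens"))
print(expand("u"))
```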

Neat.

JoeM
