Curiosity Image Retrieval Tools, scripts and software |
Sep 27 2012, 04:07 PM
Post
#166
|
|
Senior Member Group: Members Posts: 1619 Joined: 12-February 06 From: Bergerac - FR Member No.: 678 |
curl is fine too.
Just an example of how to download a bunch of frames, from "url/folder/0001.jpeg" to "url/folder/0010.jpeg":
cd destination
curl -O url/folder/00[01-10].jpeg
Works like a charm. |
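For anyone scripting the same thing outside curl, here is a minimal Python sketch of what the zero-padded [01-10] range expands to. The base "url/folder" is the placeholder from the post, and expand_range is a hypothetical helper name, not part of curl.

```python
def expand_range(base, start, stop, width=4, ext=".jpeg"):
    """Return the URLs covered by a zero-padded numeric range, inclusive."""
    return [f"{base}/{n:0{width}d}{ext}" for n in range(start, stop + 1)]

# "url/folder" is the placeholder used in the post, not a real address.
urls = expand_range("url/folder", 1, 10)
print(urls[0])   # first URL in the range
print(urls[-1])  # last URL in the range
```

The list can then be handed to whatever downloader is available.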
|
|
Sep 27 2012, 07:04 PM
Post
#167
|
|
Member Group: Members Posts: 154 Joined: 19-September 12 Member No.: 6658 |
You do know you can have wget on OSX, right? I installed MacPorts to have all the usual tools, including wget, but if you don't want to install MacPorts, apparently you can download the source code from gnu.org and do the usual installation (tar, ./configure, make, make install). If you google "mac osx wget" the first link has the details. If you are used to working under Linux, MacPorts is your friend. Paolo
Thanks Paolo! Yes, I know MacPorts and it would not be a big problem to get wget working ... but the Mac is my postproduction workhorse and I'm "officially" not allowed to install software there. On the other hand ... I just thought I'd put the curl version in here - it might be helpful to somebody. Ronald |
|
|
Sep 28 2012, 01:38 PM
Post
#168
|
|
Senior Member Group: Members Posts: 1074 Joined: 21-September 07 From: Québec, Canada Member No.: 3908 |
I've been trying to use Joe's "Curiosity search engine" today, and a search for "sol51" turns up no results, even though navcam images are posted on the MSL raw images pages. Also, a search for "sol50" doesn't list the mastcam images that I see on the raw images page. Is there something I am doing wrong?
|
|
|
Sep 28 2012, 01:48 PM
Post
#169
|
|
Junior Member Group: Members Posts: 43 Joined: 7-August 12 From: The Netherlands Member No.: 6493 |
Where urls.txt is a plain list of the image urls - you need to remove the wget stuff in the generated list ...
Thanks, glad you find it useful! You can change the default prefix (currently wget) for your own convenience through the little menu behind the "Wrench" icon. Currently it states: "wget -i -", in your case that would be: "xargs -n 1 curl -O ". Or you have the option to skip the addition of the command stuff to the generated list entirely, again from the same menu. Greetings, Ludo. |
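If neither wget nor curl is available, the plain URL list can also be fetched with a few lines of Python. This is only a sketch: urls.txt is the file name from the quoted post, while local_name and download_all are hypothetical helper names. Like curl -O, it saves each file under the last path segment of its URL.

```python
import os
import urllib.request
from urllib.parse import urlparse

def local_name(url):
    """File name to save under: the last path segment of the URL."""
    return os.path.basename(urlparse(url).path)

def download_all(list_path, dest="."):
    """Fetch every non-empty line of list_path into the dest directory."""
    with open(list_path) as fh:
        for line in fh:
            url = line.strip()
            if url:
                target = os.path.join(dest, local_name(url))
                urllib.request.urlretrieve(url, target)

# Example call (needs network access, so it is left commented out):
# download_all("urls.txt")
```

It fetches one URL at a time, which matches the "-n 1" in the xargs variant.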
|
|
Sep 28 2012, 05:14 PM
Post
#170
|
|
Senior Member Group: Members Posts: 4247 Joined: 17-January 05 Member No.: 152 |
a search for "sol51" turns up no results
Not just Joe's search site, but also his main image site is stuck. Maybe a change in the format of the JPL site?
|
|
|
Sep 28 2012, 08:58 PM
Post
#171
|
|
Junior Member Group: Members Posts: 43 Joined: 7-August 12 From: The Netherlands Member No.: 6493 |
As far as I can tell no changes on the JPL site. So we'll just have to wait for Joe to look at it.
My listing site is currently up to date, and includes an early preview of a mosaic/grid thumbnail view. Very early work, so probably not useful yet. See: http://msl-raw-images.appspot.com/listing.html |
|
|
Sep 28 2012, 11:38 PM
Post
#172
|
|
Senior Member Group: Members Posts: 1465 Joined: 9-February 04 From: Columbus OH USA Member No.: 13 |
Not just Joe's search site, but also his main image site is stuck. Maybe a change in the format of the jpl site?
Ack... why do I change things (upgrade Ubuntu on the server) right before I go on vacation? There was a security feature on the new MySQL that I needed to enable. Anyway, there should be 314 or so new images now. Heading for Martha's Vineyard & a lot of water-worn pebbles. Driving through Pennsylvania is cool with all the road cuts showing the layered rocks. |
|
|
Oct 6 2012, 09:46 AM
Post
#173
|
|
Junior Member Group: Members Posts: 43 Joined: 7-August 12 From: The Netherlands Member No.: 6493 |
Hi all,
Time for a status update on the http://msl-raw-images.appspot.com site: The site has run in a degraded mode for a couple of weeks, due to deep-rooted scalability issues. I finally found some time to get to the problem and fix it. For the past couple of days the service has been fully up and running, including Twitter updates @MSLRawImages, on the 5-minute update cycle I like it to be on. I'd like to thank all the visitors who have stuck with me during this time. Now I'm back to developing features and layout, starting with a couple of cleanups:
Greetings, Ludo. |
|
|
Oct 6 2012, 06:15 PM
Post
#174
|
|
Junior Member Group: Members Posts: 60 Joined: 3-January 09 Member No.: 4520 |
Nice work, Ludo. Yeah, I suspect the number of pictures has gotten into the "limit/offset and index or die" stage of database usage (16,000+ listed on MSL Raws, sounds about right).
Joe, as promised, here's some database tuning advice. Right now, page hits are taking roughly 10 seconds. This is probably because the select query is returning all 16,000 rows, then tossing most and just displaying the 50 it's got, or (less likely) because the query takes 10 seconds to sort the results the right way. The latter is a little trickier to fix than the former.
The former is pretty easy. MySQL and PostgreSQL both support "LIMIT" and "OFFSET" clauses, which restrict the data transferred between the database and the webserver. So, for example, "SELECT ... WHERE .... LIMIT 50 OFFSET 100". Most quality web libraries also allow you to "page" database results. You might be surprised just how expensive shipping data out of a database really is. Databases are much happier doing a lot of crunching on their end, then returning tiny little results.
The latter problem, getting the results sorted quickly, is simple in theory: add indexes. You can get a speedup on the order of 100x with database indexes, at only a small cost in insertion and update speed. The tricky part is getting the right indexes. Databases tend to ignore indexes unless they're exactly right. I can go into how to do that if you're interested. |
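To make the LIMIT/OFFSET idea concrete, here is a small paging sketch. It uses Python's built-in sqlite3 module rather than MySQL or PostgreSQL, and the table name images, the column released and the helper fetch_page are all made up for illustration, so treat it as a demonstration of the clause and not of anyone's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for an image listing.
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, released REAL)")
conn.executemany(
    "INSERT INTO images (released) VALUES (?)",
    [(float(t),) for t in range(16000)],
)

def fetch_page(conn, page, page_size=50):
    """Return one page of rows, newest first; page numbers start at 0."""
    cur = conn.execute(
        "SELECT id, released FROM images "
        "ORDER BY released DESC LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    )
    return cur.fetchall()

rows = fetch_page(conn, page=2)
print(len(rows), rows[0])
```

Only the 50 requested rows cross from the database to the caller; the other 15,950 never leave it.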
|
|
Oct 6 2012, 09:31 PM
Post
#175
|
|
Senior Member Group: Members Posts: 1465 Joined: 9-February 04 From: Columbus OH USA Member No.: 13 |
maschnitz, are you referring to the original page at http://curiositymsl.com? If so, the queries are already limited like so, for example:
SELECT * FROM msl WHERE visible=1 ORDER BY etreleased desc LIMIT 1,100
So the slowdown is coming from somewhere else. I'm getting about 3-4 seconds for a 100 results/page query. I think the bulk of the delay is loading the thumbnails. On the road right now, but I can look at it more closely later. I think there's a Linux app for profiling MySQL that I can look into. The newer website at http://curiositymsl.com/search uses a much more complex query, although it is also limited (to 50 results/query). BTW, both sites use jQuery per your suggestion! |
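A side note on the comma form of LIMIT, as a hedged sketch: in MySQL, "LIMIT a,b" means offset a, then b rows, with offsets counted from 0, so "LIMIT 1,100" starts at the second row of the sorted result. SQLite accepts the same comma syntax with the same meaning, so the demo below uses Python's sqlite3 with a made-up table called demo, not the real msl table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical ten-row table, values 1..10.
conn.execute("CREATE TABLE demo (n INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?)", [(n,) for n in range(1, 11)])

def top(conn, offset, count):
    """Rows in descending order using the comma form: LIMIT offset, count."""
    cur = conn.execute(
        "SELECT n FROM demo ORDER BY n DESC LIMIT ?, ?", (offset, count)
    )
    return [row[0] for row in cur]

from_one = top(conn, 1, 3)   # skips the largest value
from_zero = top(conn, 0, 3)  # starts at the largest value
print(from_one, from_zero)
```

Whether skipping the first row is intended on the real page is not something the thread says, so this only illustrates the syntax.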
|
|
Oct 6 2012, 09:49 PM
Post
#176
|
|
Junior Member Group: Members Posts: 60 Joined: 3-January 09 Member No.: 4520 |
Yeah, the root page. Hm. OK. I checked the root page with Firebug - the initial page hit took 4 seconds; the thumbnails as a whole took 2 seconds. There's not a lot you can do about the thumbnails besides hosting them yourself.
4 seconds could be better. The index page I think is starting to suffer under all the pictures in the result set, then. Probably will just get worse. Nice work with the limit. I strongly suspect, then, that you need a few indexes. The way to confirm this is to try running the front-page SQL query by hand against the database; then run it again without the order-by clause. If I'm right, the query will run a LOT faster without the order-by. So, something you could try is to create an index. Something like "create index etrel_idx on msl (etreleased)". See if that speeds it up. You might need one of those for each of the columns you can sort by on the front page. (Sometimes that's not enough, though - long story.) |
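The before/after check described above can be tried on a toy copy of the table. This sketch uses SQLite through Python's sqlite3 module instead of MySQL, with invented data and only the three columns the query needs, so the planner output will not match what MySQL's EXPLAIN prints. The helper name plan is hypothetical; the index statement is the one suggested in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE msl (id INTEGER PRIMARY KEY, visible INTEGER, etreleased REAL)"
)
# 16,000 rows with distinct, shuffled etreleased values (invented data).
conn.executemany(
    "INSERT INTO msl (visible, etreleased) VALUES (1, ?)",
    [(float((t * 7919) % 16000),) for t in range(16000)],
)

QUERY = "SELECT * FROM msl WHERE visible=1 ORDER BY etreleased DESC LIMIT 100"

def plan(conn, sql):
    """Return the detail strings from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

before_plan = plan(conn, QUERY)
before_rows = conn.execute(QUERY).fetchall()

conn.execute("CREATE INDEX etrel_idx ON msl (etreleased)")

after_plan = plan(conn, QUERY)
after_rows = conn.execute(QUERY).fetchall()

print("before:", before_plan)
print("after: ", after_plan)
```

If the index is being picked up, its name shows up in the second plan, and the rows returned are the same either way.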
|
|
Oct 6 2012, 11:08 PM
Post
#177
|
|
Senior Member Group: Members Posts: 1465 Joined: 9-February 04 From: Columbus OH USA Member No.: 13 |
I'm not indexing on etreleased or ettaken (maybe the most likely search fields), but even so the queries take only 50 msec or so for 100 items. I find that the time taken to load a page pretty much scales with the number of items displayed (20, 50 or 100). I.e., with a cleared cache, about 2 seconds for 20 items and 9 seconds for 100 items. My guess right now is that besides loading the thumbnails (which can be 10KB each) most of the remaining time is taken up converting the etreleased and ettaken times to a string date (they're stored in the database as ephemeris time, seconds since 1/1/2000). The php script does exec calls to external programs to do these conversions. At least, the time taken doesn't scale with the number of items in the database, but is limited to the number of items in the result set, 20, 50 or 100. As it is, I guess a way to speed it up would be to store the string dates in the database.
|
|
Oct 7 2012, 12:11 AM
Post
#178
|
|
Junior Member Group: Members Posts: 60 Joined: 3-January 09 Member No.: 4520 |
Very odd. 4 seconds is an eternity for a computer. Nothing you listed should take more than a second, in theory.
The images in the web page I see are coming direct from JPL's site, after the index page loads in the browser. The four seconds I'm talking about is the amount of time it takes for the index page to load. Are you loading the images from JPL inside PHP, too? (Maybe to get the width and height? If so, that'd be a great thing to stash in the database, just to avoid talking to JPL during your PHP script. Talking to JPL directly from PHP is gonna take some time.)
Of the things you listed, the calls out to convert the dates are the most suspect. You should be able to do that from within PHP. Here's a trick you can use. Ephemeris time is quite similar to UNIX time - epoch seconds since 1/1/2000, vs. midnight 1/1/1970. You can convert from ephemeris time to UNIX time just by adding 946684800 (the number of seconds between 1970 and 2000). Then PHP has a couple of mechanisms for making a UNIX time a usable date object - e.g. under PHP 5.3+, you can use:
$unixtime = $ephemeris + 946684800;
$date = new DateTime('@' . $unixtime);
If this does the right thing, this will be a lot faster than shelling out. |
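As a quick cross-check of the 946684800 constant, here is the same arithmetic in Python (a different language from the site's PHP, chosen only for the check). It assumes, as the post does, that the epoch is midnight UTC on 1/1/2000; to_datetime is a hypothetical helper name.

```python
from datetime import datetime, timezone

unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
year_2000 = datetime(2000, 1, 1, tzinfo=timezone.utc)

# Seconds between the two midnights; should reproduce the constant in the post.
EPOCH_OFFSET = int((year_2000 - unix_epoch).total_seconds())

def to_datetime(seconds_since_2000):
    """Treat the input as seconds since 2000-01-01 00:00 UTC and convert it."""
    return datetime.fromtimestamp(seconds_since_2000 + EPOCH_OFFSET, tz=timezone.utc)

print(EPOCH_OFFSET)
print(to_datetime(0).isoformat())
```

The offset works out to 10,957 days of 86,400 seconds each.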
|
|
Oct 7 2012, 10:51 AM
Post
#179
|
|
Senior Member Group: Members Posts: 1465 Joined: 9-February 04 From: Columbus OH USA Member No.: 13 |
If this does the right thing, this will be a lot faster than shelling out.
Just gave it a quick try. Looks like it will speed up the page considerably (like from 9 seconds down to 2 seconds for 100 items in my test). I need to get the exact offset between ET (J2000) and Unix time, since J2000 is January 1, 2000, 11:58:55.816 UTC. Heading home today & so will do the update sometime this evening. Thanks! |
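For the exact offset, a sketch in Python that derives the constant from the J2000 instant quoted above. One caveat that goes beyond the post: ET keeps counting through leap seconds while Unix time does not, and three leap seconds were inserted after 2000 (end of 2005, end of 2008, mid-2012), so a fixed constant can be a few seconds off for recent images. The leap_seconds_since_2000 parameter is an assumed, caller-supplied correction for that, and et_to_utc is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# J2000 epoch expressed in UTC, as quoted in the post.
J2000_UTC = datetime(2000, 1, 1, 11, 58, 55, 816000, tzinfo=timezone.utc)

# Unix timestamp of that instant, i.e. the constant to add to ET.
J2000_UNIX = J2000_UTC.timestamp()

def et_to_utc(et_seconds, leap_seconds_since_2000=0):
    """Convert ET seconds past J2000 to a UTC datetime.

    leap_seconds_since_2000 is an assumed, caller-supplied correction;
    with the default of 0 this is a plain constant-offset conversion.
    """
    return J2000_UTC + timedelta(seconds=et_seconds - leap_seconds_since_2000)

print(J2000_UNIX)
print(et_to_utc(0).isoformat())
```

The constant comes out 43,135.816 seconds larger than the midnight-based 946684800.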
|
|
Oct 7 2012, 04:50 PM
Post
#180
|
|
Administrator Group: Admin Posts: 5172 Joined: 4-August 05 From: Pasadena, CA, USA, Earth Member No.: 454 |
Just a comment to say, carry on, guys -- I'm learning lots from this discussion! And the continuous tweaks and improvements to the image retrieval tools are great
|
|
|