java spider
Status : Finished
Expert : Internationalcoders
Buyer : Anne

Budget : $1,000-5,000
Estimated development time : 7 Days
Skills : MySQL, Java, PHP, Search

Description

I want a spider modified or built, whichever is easier. You can use existing open-source libraries or anything else; it doesn't matter as long as it achieves the tasks.

I want to be able to run the spider as an applet and from the command line, so that it can be executed as a cron job.




The spider must be able to accept command-line arguments, e.g.

public static void main(String[] args) { String var = args[0]; }

and the applet should have a simple GUI.
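A minimal sketch of how the command-line entry point could read its arguments. The class name SpiderArgs and the option names --leave-domain and --reindex are assumptions made for illustration; the spec only requires that arguments are accepted.

```java
// Hypothetical argument holder; the names and flags are illustrative, not part of the spec.
class SpiderArgs {
    final String startUrl;      // first positional argument: where to start crawling
    final boolean leaveDomain;  // true if the spider may follow links off the start domain
    final boolean reindex;      // true if changed pages should be indexed again

    SpiderArgs(String startUrl, boolean leaveDomain, boolean reindex) {
        this.startUrl = startUrl;
        this.leaveDomain = leaveDomain;
        this.reindex = reindex;
    }

    // Parses "<start-url> [--leave-domain] [--reindex]".
    static SpiderArgs parse(String[] args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("usage: <start-url> [--leave-domain] [--reindex]");
        }
        boolean leave = false;
        boolean re = false;
        for (int i = 1; i < args.length; i++) {
            switch (args[i]) {
                case "--leave-domain": leave = true; break;
                case "--reindex": re = true; break;
                default: throw new IllegalArgumentException("unknown option: " + args[i]);
            }
        }
        return new SpiderArgs(args[0], leave, re);
    }

    public static void main(String[] args) {
        try {
            SpiderArgs a = parse(args);
            System.out.println("start=" + a.startUrl
                    + " leaveDomain=" + a.leaveDomain + " reindex=" + a.reindex);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

A cron entry would then call something like `java SpiderArgs http://example.com --reindex`, where the URL is a placeholder.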




The spider should be able to take in a domain name and crawl that domain only, unless the option is chosen for the spider to leave the domain. It must have the option to re-index a page if its HTML has changed.
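One way the domain restriction and the change check could work, as a sketch only. The class CrawlScope and its methods are hypothetical. It assumes "same domain" means an exact host match (so www.example.com and example.com count as different hosts) and that a page counts as changed when the SHA-256 hash of its HTML differs from the stored one.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; names are illustrative.
class CrawlScope {
    // True if the link may be crawled: either leaving the domain is allowed,
    // or the link's host equals the start URL's host (case-insensitive).
    // URI.create throws IllegalArgumentException for malformed URLs.
    static boolean inScope(String startUrl, String linkUrl, boolean leaveDomain) {
        if (leaveDomain) {
            return true;
        }
        String startHost = URI.create(startUrl).getHost();
        String linkHost = URI.create(linkUrl).getHost();
        return startHost != null && startHost.equalsIgnoreCase(linkHost);
    }

    // Hex-encoded SHA-256 of the page HTML, stored alongside the indexed page.
    static String fingerprint(String html) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(html.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True if the page was never indexed or its HTML differs from the stored fingerprint.
    static boolean needsReindex(String storedFingerprint, String html) {
        return storedFingerprint == null || !storedFingerprint.equals(fingerprint(html));
    }
}
```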




It should check the HTTP status of a page and not index it unless the page is available, i.e. status 200 etc.
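The status check could be sketched with the standard HttpURLConnection class. The class name StatusCheck is hypothetical, and treating only status 200 as "available" is an assumption; the "etc." above may be meant to include other codes.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper; names are illustrative.
class StatusCheck {
    // Assumption: only 200 OK counts as available.
    static boolean isIndexable(int status) {
        return status == HttpURLConnection.HTTP_OK;
    }

    // Sends a HEAD request so only the headers are transferred, and returns the status code.
    static int fetchStatus(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        conn.setConnectTimeout(5000); // milliseconds
        conn.setReadTimeout(5000);    // milliseconds
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

The crawler would call `isIndexable(fetchStatus(url))` before downloading and parsing the page body.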




--------------- Specs ---------------------------




Spider gets the full HTML page contents

if the HTML tag I want to check for (e.g. <object></object>) is found then
    parse all HTML tags
    get: the array of tags I specify
        example: String[] getTags = {"title", "keyword"};

    if keyword is empty/missing and description or title is not empty then
        split description at every word
        return array of keywords, limited to 250
    else if title is empty
        attempt to extract keywords from the HTML body, up to 250 words

if the HTML tag I checked for is not found then do not parse the page; just get all the links from the page and continue crawling.
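The keyword fallback above leaves a few cases open, so the following is only one possible reading. The class KeywordFallback is hypothetical and makes these assumptions: a non-empty keyword value is split on commas and used as-is; if the description is empty but the title is not, the title is split instead; and the inputs have already been extracted from the page as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; one interpretation of the fallback rules in the spec.
class KeywordFallback {
    static final int LIMIT = 250; // maximum number of keywords returned

    static List<String> keywords(String metaKeywords, String description,
                                 String title, String bodyText) {
        if (!isBlank(metaKeywords)) {
            return split(metaKeywords, "\\s*,\\s*"); // assumption: comma-separated list
        }
        if (!isBlank(description)) {
            return split(description, "\\s+");       // split description at every word
        }
        if (!isBlank(title)) {
            return split(title, "\\s+");             // assumption: fall back to title words
        }
        // Nothing usable in the head: take words from the body text.
        return split(bodyText == null ? "" : bodyText, "\\s+");
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Splits on the given regex, drops empty pieces, and stops at LIMIT entries.
    private static List<String> split(String text, String regex) {
        List<String> out = new ArrayList<>();
        for (String part : text.trim().split(regex)) {
            if (!part.isEmpty() && out.size() < LIMIT) {
                out.add(part);
            }
        }
        return out;
    }
}
```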




The crawler needs to be able to return the values of the HTML tags and attributes that I specify.




I'd like the values returned to be in an associative array/map so that

myObject['title'] will contain the title
myObject['keyword'] will contain an array of keywords
myObject['tagName']['Attribute'] will get the attribute value of the HTML tag, for example
myObject['embed']['src']
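Java has no bracket syntax for associative arrays, so the closest equivalent is a Map. The sketch below is an assumption about how the result could be shaped: a Map<String, Object> whose values are a String (title), a List<String> (keyword), or a nested Map<String, String> (tag attributes). The class PageResult and the sample values are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical result container; names and sample values are illustrative.
class PageResult {
    // Builds a sample result in the shape described above.
    static Map<String, Object> example() {
        Map<String, Object> page = new HashMap<>();
        page.put("title", "Example page");                    // myObject['title']
        page.put("keyword", Arrays.asList("java", "spider")); // myObject['keyword']
        Map<String, String> embed = new HashMap<>();
        embed.put("src", "movie.swf");
        page.put("embed", embed);                             // myObject['embed']['src']
        return page;
    }

    // Returns the attribute value for a tag, or null if the tag or attribute is absent.
    @SuppressWarnings("unchecked")
    static String attribute(Map<String, Object> page, String tag, String attr) {
        Object value = page.get(tag);
        if (!(value instanceof Map)) {
            return null;
        }
        return ((Map<String, String>) value).get(attr);
    }
}
```

With this shape, `PageResult.attribute(page, "embed", "src")` plays the role of myObject['embed']['src'].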




Lastly, I want the data to be inserted/indexed in my MySQL database, but only if the HTML tag I checked for was found.
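A sketch of the conditional insert using plain JDBC. The table name pages and its columns url, title, and keywords are assumptions, since the database schema is not given here; the class PageIndexer is hypothetical. Running it against MySQL would also need the MySQL JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical indexer; table and column names are assumptions.
class PageIndexer {
    static final String INSERT_SQL =
            "INSERT INTO pages (url, title, keywords) VALUES (?, ?, ?)";

    // Inserts the page only if the checked-for tag was found.
    // Returns true if exactly one row was written, false if the page was skipped.
    static boolean index(Connection conn, boolean tagFound,
                         String url, String title, String keywords) throws SQLException {
        if (!tagFound) {
            return false; // tag missing: do not touch the database
        }
        // A prepared statement keeps page content from being interpreted as SQL.
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, url);
            ps.setString(2, title);
            ps.setString(3, keywords);
            return ps.executeUpdate() == 1;
        }
    }
}
```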






Please make sure you read and understand the requirements. This will be integrated into one of my projects, and it needs to be built according to my specs.




The spider can be a modification of the one found here

http://www.developer.com/java/other/article.php/1573761/Programming-a-Spider-in-Java.htm

or here

http://www.javaworld.com/javaworld/jw-11-2004/jw-1101-spider.html

or anything from the net

http://www.google.com/search?hl=en&source=hp&q=java+web+spider&aq=0&oq=java+web+spid&aqi=g2

or a class or library you already have that does this.

It doesn't matter; I just want a spider customized to do the above.




Escrow payment only... No automated bids please.


Project Bids

Expert                          Location          Message   Last login       Has portfolios
Birapsales                      Texas USA         0         about 18 hours   yes
Igloo360                        Pakistan          0         1 day            yes
Harshamm                        Pakistan          0         about 3 hours    yes
Internationalcoders (Win Bid)   Pakistan          0         3 days           yes
Dimashia                        Mississippi USA   0         about 9 hours    yes


Project Buyer

  • Anne
  • Projects posted: 79
  • Last login: 1 day ago

Copyright (c) 2007-2011 Taskcity All rights reserved