If you want to get this done quickly and easily, use this software: http://sitemonitoring.sourceforge.net/. It can browse your website with a spider, load all pages from a sitemap, and so on, and check whether your pages contain a given string. By the way, I wrote it and use it for exactly these tasks. It's also written in Java :-)
If you want to code it yourself, check out jsoup (http://jsoup.org/), a Java library for fetching and parsing HTML. It's also used in sitemonitoring.
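If you go the jsoup route, a minimal sketch of a "does this page contain some string" check could look like the one below. This is not how sitemonitoring does it internally; the class name `PageCheck`, its method names, the 10-second timeout, and the default URL and search string are all assumptions made up for illustration. It needs the jsoup jar on the classpath.

```java
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical helper class; the name and methods are illustrative only.
class PageCheck {

    // Returns true if the text content of the parsed page contains the needle.
    // The match is case-sensitive and ignores markup (tags, attributes).
    static boolean containsText(Document doc, String needle) {
        return doc.text().contains(needle);
    }

    // Fetches the page over HTTP with jsoup and checks its text content.
    static boolean pageContains(String url, String needle) throws IOException {
        Document doc = Jsoup.connect(url)
                .timeout(10_000) // milliseconds; an arbitrary value for this sketch
                .get();
        return containsText(doc, needle);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder defaults; pass your own URL and search string as arguments.
        String url = args.length > 0 ? args[0] : "https://example.com/";
        String needle = args.length > 1 ? args[1] : "Example Domain";
        System.out.println(url + " contains \"" + needle + "\": "
                + pageContains(url, needle));
    }
}
```

To check a whole site you would loop over the URLs from your sitemap and call `pageContains` for each one. Keeping `containsText` separate from the fetching also lets you try it on HTML you already have, via `Jsoup.parse(html)`.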