With this module you can extract news from sites that don't use RSS.

You need to know the structure of the HTML page. Then you can build an XPath query for each field you want to extract: the title, link, date and text of each news item.
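
As an illustration only, here is a minimal Python sketch of this idea. It is not this module's code. The sample page, the class names (news, date, text) and the function extract_news are all hypothetical. It uses Python's standard-library ElementTree, which supports only a limited subset of XPath and needs well-formed markup, so the attributes and text are read from the matched elements instead of being selected in the query itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical news page (an assumption for this example), kept well-formed
# so that the standard-library XML parser accepts it.
PAGE = """
<html>
  <body>
    <div class="news">
      <h2><a href="/news/1">First headline</a></h2>
      <span class="date">2015-03-01</span>
      <p class="text">Text of the first item.</p>
    </div>
    <div class="news">
      <h2><a href="/news/2">Second headline</a></h2>
      <span class="date">2015-03-02</span>
      <p class="text">Text of the second item.</p>
    </div>
  </body>
</html>
"""


def extract_news(page):
    """Return one dict per news item found in the page."""
    root = ET.fromstring(page)
    items = []
    # One query selects every news item; the queries inside the loop
    # are relative to that item.
    for node in root.findall(".//div[@class='news']"):
        link = node.find("./h2/a")
        items.append({
            "title": link.text,
            "link": link.get("href"),
            "date": node.find("./span[@class='date']").text,
            "text": node.find("./p[@class='text']").text,
        })
    return items


news = extract_news(PAGE)
for item in news:
    print(item["date"], item["title"], item["link"])
```

The pattern is one query that matches each news item, plus one relative query per field. With a full XPath engine the same fields could be selected directly, for example with a query ending in /@href for the link.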

You can learn more about XPath syntax from W3Schools:
http://www.w3schools.com/xpath/xpath_syntax.asp