Web crawler + simulated browser (fetching resources from access-restricted sites):
Get the URL
Download the resource
Analyze
Process
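import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;

public class http {
    public static void main(String[] args) throws Exception {
        // Prefer HTTPS over plain HTTP: it is more secure.
        // URL.openStream() opens a connection to the URL and returns an InputStream for reading data from it.
        // Get the URL
        URL url = new URL("https://www.jd.com");
        // Download the resource
        InputStream is = url.openStream();
        // try-with-resources closes the reader (and the underlying stream) even if reading fails
        try (BufferedReader br = new BufferedReader(new InputStreamReader(is, "UTF-8"))) {
            String msg = null;
            while ((msg = br.readLine()) != null) {
                System.out.println(msg);
            }
        }
    }
}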
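The listing above covers only the first two steps (get the URL, download the resource). As a minimal sketch of the remaining two steps, analysis could mean pulling link targets out of the downloaded HTML, and processing could mean filtering and de-duplicating them. The class name `LinkExtractor`, the method `extractLinks`, and the sample HTML are assumptions made for illustration, not part of the original code; a regular expression is also only a rough stand-in for a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {
    // Matches href="..." or href='...' and captures the attribute value in group 2.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    // Analyze: find every href value. Process: keep absolute http(s) links, drop duplicates.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String link = m.group(2).trim();
            boolean absolute = link.startsWith("http://") || link.startsWith("https://");
            if (absolute && !links.contains(link)) {
                links.add(link);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Hypothetical page content; in the crawler this would be the text read from the stream.
        String html = "<a href=\"https://example.com/a\">A</a>"
                + "<a href='https://example.com/b'>B</a>"
                + "<a href=\"/relative\">C</a>"
                + "<a href=\"https://example.com/a\">A again</a>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

In a crawler, the links returned here would typically be queued as the next URLs to download, which closes the loop back to the first step.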
Fetching access-restricted web resources:
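import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class http {
    public static void main(String[] args) throws Exception {
        // URL.openConnection() returns a URLConnection instance representing a connection to the remote object the URL refers to.
        // Subclasses of URLConnection include HttpURLConnection and JarURLConnection.
        URL url = new URL("https://www.jd.com");
        // Download the resource
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET"); // simulate a browser's GET request
        // Send a browser-like User-Agent so the site does not reject the request as coming from a script
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763");
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String msg = null;
            while ((msg = br.readLine()) != null) {
                System.out.println(msg);
            }
        }
    }
}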
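To see why the User-Agent header matters without depending on a real site, the following sketch starts a small local server that only accepts browser-like clients and then requests it twice. Everything here is an assumption made for illustration: the class `UserAgentDemo`, the methods `startServer` and `fetch`, the "starts with Mozilla/" rule, and the response bodies. Real sites use their own, usually more elaborate, checks. The server uses the JDK's built-in `com.sun.net.httpserver` package.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class UserAgentDemo {
    // Starts a local server on a free port that answers 200 only to browser-like User-Agents.
    public static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean browserLike = ua != null && ua.startsWith("Mozilla/");
            byte[] body = (browserLike ? "welcome" : "forbidden").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(browserLike ? 200 : 403, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    // Sends a GET request and returns "status:body"; a null userAgent keeps Java's default header.
    public static String fetch(String url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // avoid hanging forever on an unreachable host
        conn.setReadTimeout(5000);
        if (userAgent != null) {
            conn.setRequestProperty("User-Agent", userAgent);
        }
        int status = conn.getResponseCode();
        // For 4xx/5xx responses getInputStream() throws, so read the error stream instead.
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        StringBuilder body = new StringBuilder();
        if (in != null) {
            try (BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = br.readLine()) != null) {
                    body.append(line);
                }
            }
        }
        conn.disconnect();
        return status + ":" + body;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            System.out.println(fetch(url, null));                              // default Java User-Agent
            System.out.println(fetch(url, "Mozilla/5.0 (X11; Linux x86_64)")); // browser-like User-Agent
        } finally {
            server.stop(0);
        }
    }
}
```

Checking `getResponseCode()` before reading is also worth carrying over to the crawler itself: calling `getInputStream()` directly on a rejected request throws an `IOException`, which hides the status code the server actually returned.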