It's a bug in Java: URL incorrectly removes the path leaf when the relative spec is query-only (RFC 1808). The bug description contains a workaround.
Thanks. I see that there is no ready-made solution for my task. I will port my own C++ code.
– JakeLi Jan 3 at 13:25.
Reading the Javadoc of URL(URL context, String spec) provides the best answer to your question: If the spec's path component begins with a slash character "/" then the path is treated as absolute and the spec path replaces the context path. Otherwise, the path is treated as a relative path and is appended to the context path, as described in RFC2396. Also, in this case, the path is canonicalized through the removal of directory changes made by occurrences of ".." and ".".
Since your context URL does not end with a slash, its last path segment is removed. Try adding a slash: old = "domain/script/?param.
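To make the trailing-slash rule concrete, here is a minimal sketch using the two-argument URL constructor quoted above. The class name TrailingSlashDemo, the helper resolve, and the http://domain/... URLs are made up for illustration. Recent JDKs mark the URL constructors as deprecated, so expect a compiler warning there.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo class; the names and URLs are illustrative only.
final class TrailingSlashDemo {

    // Resolves spec against context with new URL(URL, String) and
    // converts the checked exception into an unchecked one.
    static String resolve(String context, String spec) {
        try {
            return new URL(new URL(context), spec).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // No trailing slash: "script" is treated as a leaf and replaced.
        System.out.println(resolve("http://domain/script", "page.html"));
        // Trailing slash: "script/" is treated as a directory and kept.
        System.out.println(resolve("http://domain/script/", "page.html"));
        // A leading slash in the spec replaces the whole context path.
        System.out.println(resolve("http://domain/script/", "/page.html"));
        // ".." is canonicalized away after the paths are joined.
        System.out.println(resolve("http://domain/a/b/", "../page.html"));
    }
}
```

The first call should print http://domain/page.html and the second http://domain/script/page.html, which is the difference the answer is pointing at.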
I'm making a sitemap. I need to download each page and extract all URLs (a href) from it. The URLs vary: they start with ?, /, ./, ../, or are absolute, etc. I need to process them all. Sorry, I didn't get your best answer. In my sample, the URL starts with "?" (it's the query part). – JakeLi Jan 3 at 11:11
Sorry, I edited the answer; I hope it's clearer now.
– Tarlog Jan 3 at 12:38.
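For the sitemap case, one possible workaround is to special-case links that start with "?" and let java.net.URI.resolve handle the other forms (/, ./, ../, absolute). This is only a sketch under that assumption, not the workaround from the bug report. QueryOnlyResolver and its resolve method are hypothetical names, and the URLs are placeholders.

```java
import java.net.URI;

// Hypothetical helper; not part of the JDK.
final class QueryOnlyResolver {

    // Resolves an href found on the page at base.
    // Query-only hrefs keep the base path and replace only the query;
    // every other form is delegated to URI.resolve.
    static URI resolve(URI base, String href) {
        if (href.startsWith("?")) {
            String b = base.toString();
            int cut = b.indexOf('#');          // drop the base fragment, if any
            if (cut >= 0) b = b.substring(0, cut);
            cut = b.indexOf('?');              // drop the base query, if any
            if (cut >= 0) b = b.substring(0, cut);
            return URI.create(b + href);
        }
        return base.resolve(href);
    }

    public static void main(String[] args) {
        URI page = URI.create("http://domain/dir/page.html?x=1#top");
        System.out.println(resolve(page, "?y=2"));        // query-only link
        System.out.println(resolve(page, "other.html"));  // sibling page
        System.out.println(resolve(page, "../a"));        // parent directory
        System.out.println(resolve(page, "/root.html"));  // root-relative
        System.out.println(resolve(page, "http://example.com/")); // absolute
    }
}
```

URI.create throws IllegalArgumentException for malformed input, so a crawler would probably want to catch that and skip the link rather than abort the page.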
I have a page like mypage.com/a/b/somePage.html, and the href attribute of one of its anchors is something like "a/b/anotherPage. When I try to get the absolute URL by creating a new URL object from the page URL and the href value, I get mypage.com/a/b/a/b/anotherPage.html. This is causing a problem for me, but somehow browsers handle it properly.
Is there anything available out of the box to solve this problem?
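A relative href such as a/b/anotherPage.html really does resolve to /a/b/a/b/... against that page URL. If browsers land on the expected page, one possible explanation (an assumption, since the page source is not shown) is that the page declares a <base href> element. The sketch below resolves against such a base when one is supplied. HrefResolver, its parameters, the http:// scheme, and the full href are illustrative guesses.

```java
import java.net.URI;

// Hypothetical helper; baseHref stands for the value of a
// <base href="..."> element, or null when the page has none.
final class HrefResolver {

    static String resolve(String pageUrl, String baseHref, String href) {
        URI base = URI.create(pageUrl);
        if (baseHref != null && !baseHref.isEmpty()) {
            // The base href may itself be relative to the page URL.
            base = base.resolve(baseHref);
        }
        return base.resolve(href).toString();
    }

    public static void main(String[] args) {
        String page = "http://mypage.com/a/b/somePage.html";
        // No <base>: the href is appended to the page's directory.
        System.out.println(resolve(page, null, "a/b/anotherPage.html"));
        // With <base href="/">: the href is resolved from the site root.
        System.out.println(resolve(page, "/", "a/b/anotherPage.html"));
        // A root-relative href needs no <base> at all.
        System.out.println(resolve(page, null, "/a/b/anotherPage.html"));
    }
}
```

The first call reproduces the doubled path from the question. The other two should both give http://mypage.com/a/b/anotherPage.html.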
I can't really give you an answer, but what I can give you is a way to a solution: you have to find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.