Expand Shortened URLs with Apache HttpClient

Last modified: August 10, 2013

1. Overview

In this article, we’re going to show how to expand URLs using HttpClient.

A simple example is when the original URL has been shortened once – by a service such as bit.ly.

A more complex example is when the URL has been shortened multiple times, by different such services, and it takes multiple passes to get to the original full URL.

If you want to dig deeper and learn other cool things you can do with the HttpClient – head on over to the main HttpClient tutorial.

2. Expand the URL Once

Let’s start simple, by expanding a URL that has been passed through a URL shortening service only once.

The first thing we’ll need is an HTTP client that doesn’t automatically follow redirects:

CloseableHttpClient client = 
  HttpClientBuilder.create().disableRedirectHandling().build();

This is necessary because we’ll need to manually intercept the redirect response and extract information out of it.

We start by sending a request to the shortened URL – the response we get back will be a 301 Moved Permanently.

Then, we need to extract the Location header, which points to the next URL and, in this case, the final one:

public String expandSingleLevel(String url) throws IOException {
    HttpHead request = null;
    try {
        // A HEAD request is enough: only the status line and headers matter
        request = new HttpHead(url);
        HttpResponse httpResponse = client.execute(request);

        int statusCode = httpResponse.getStatusLine().getStatusCode();
        if (statusCode != 301 && statusCode != 302) {
            // Not a redirect, so the URL is already fully expanded
            return url;
        }
        Header[] headers = httpResponse.getHeaders(HttpHeaders.LOCATION);
        Preconditions.checkState(headers.length == 1);
        return headers[0].getValue();
    } catch (IllegalArgumentException uriEx) {
        // The URL is not a valid URI, so return it unchanged
        return url;
    } finally {
        if (request != null) {
            request.releaseConnection();
        }
    }
}

Finally, a simple live test that expands a shortened URL:

@Test
public final void givenShortenedOnce_whenUrlIsExpanded_thenCorrectResult() throws IOException {
    final String expectedResult = "https://www.baeldung.com/rest-versioning";
    final String actualResult = expandSingleLevel("http://bit.ly/3LScTri");
    assertThat(actualResult, equalTo(expectedResult));
}
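
One detail the method above does not cover is that a Location header may carry a relative reference rather than an absolute URL. The following is a minimal sketch of how the next URL could be derived in that case, using only java.net.URI. The class LocationResolver and the method nextUrl are hypothetical names introduced for illustration and are not part of the original implementation:

```java
import java.net.URI;

class LocationResolver {

    // Hypothetical helper: decides which URL to visit next, given the
    // status code and the raw Location header value of a response.
    static String nextUrl(String requestUrl, int statusCode, String location) {
        boolean isRedirect = statusCode == 301 || statusCode == 302;
        if (!isRedirect || location == null) {
            return requestUrl; // nothing to follow
        }
        // URI.resolve returns absolute locations unchanged and resolves
        // relative ones against the URL that was requested.
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        System.out.println(nextUrl("http://bit.ly/3LScTri", 301,
            "https://www.baeldung.com/rest-versioning"));
        System.out.println(nextUrl("http://example.com/a/b", 302, "/c"));
    }
}
```

Keeping this decision in a pure function means it can be checked without any network access, and the HTTP call only has to supply the status code and the header value.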

3. Process Multiple URL Levels

The problem with short URLs is that they may be shortened multiple times, by altogether different services. Expanding such a URL takes multiple passes to get to the original URL.

We’re going to apply the expandSingleLevel primitive operation defined previously to simply iterate through all the intermediary URLs and get to the final target:

public String expand(String urlArg) throws IOException {
    String originalUrl = urlArg;
    String newUrl = expandSingleLevel(originalUrl);
    while (!originalUrl.equals(newUrl)) {
        originalUrl = newUrl;
        newUrl = expandSingleLevel(originalUrl);
    }
    return newUrl;
}
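
Since the loop only depends on the single-level step, it can also be tried out offline. The sketch below repeats the same loop but takes the single-level expansion as a parameter, and feeds it a made-up redirect table instead of real HTTP calls. The class name MultiLevelExpander and the example.com style URLs are assumptions for illustration only:

```java
import java.util.Map;
import java.util.function.UnaryOperator;

class MultiLevelExpander {

    // Same loop as expand(), with the single-level step passed in so the
    // logic can be exercised without network access.
    static String expand(String url, UnaryOperator<String> expandSingleLevel) {
        String originalUrl = url;
        String newUrl = expandSingleLevel.apply(originalUrl);
        while (!originalUrl.equals(newUrl)) {
            originalUrl = newUrl;
            newUrl = expandSingleLevel.apply(originalUrl);
        }
        return newUrl;
    }

    public static void main(String[] args) {
        // Made-up redirect table standing in for two shortening services.
        Map<String, String> redirects = Map.of(
            "http://t.example/abc", "http://b.example/xyz",
            "http://b.example/xyz", "https://www.example.com/article");

        // URLs without an entry map to themselves, which marks them as final.
        String result = expand("http://t.example/abc",
            u -> redirects.getOrDefault(u, u));
        System.out.println(result);
    }
}
```

The loop stops as soon as a step returns its own input, which is exactly what expandSingleLevel does for a response that is not a redirect.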

Now, with the new mechanism of expanding multiple levels of URLs, let’s define a test and put this to work:

@Test
public final void givenShortenedMultiple_whenUrlIsExpanded_thenCorrectResult() throws IOException {
    final String expectedResult = "https://www.baeldung.com/rest-versioning";
    final String actualResult = expand("http://t.co/e4rDDbnzmk");
    assertThat(actualResult, equalTo(expectedResult));
}

This time, the short URL http://t.co/e4rDDbnzmk, which is actually shortened twice (once via bit.ly and a second time via the t.co service), is correctly expanded to the original URL.

4. Detect Redirect Loops

Finally, some URLs cannot be expanded because they form a redirect loop. HttpClient would normally detect this type of problem, but since we turned off automatic redirect handling, it no longer does.

The final step in the URL expansion mechanism is going to be detecting the redirect loops and failing fast in case such a loop occurs.

For this to be effective, we need some additional information out of the expandSingleLevel method we defined earlier: namely, we also need to return the status code of the response along with the URL.

Since Java doesn’t support multiple return values, we’re going to wrap the information in an org.apache.commons.lang3.tuple.Pair object – the new signature of the method will now be:

public Pair<Integer, String> expandSingleLevelSafe(String url) throws IOException {

And finally, let’s include the redirect cycle detection in the main expand mechanism:

public String expandSafe(String urlArg) throws IOException {
    String originalUrl = urlArg;
    String newUrl = expandSingleLevelSafe(originalUrl).getRight();
    List<String> alreadyVisited = Lists.newArrayList(originalUrl, newUrl);
    while (!originalUrl.equals(newUrl)) {
        originalUrl = newUrl;
        Pair<Integer, String> statusAndUrl = expandSingleLevelSafe(originalUrl);
        newUrl = statusAndUrl.getRight();
        boolean isRedirect = statusAndUrl.getLeft() == 301 || statusAndUrl.getLeft() == 302;
        if (isRedirect && alreadyVisited.contains(newUrl)) {
            throw new IllegalStateException("Likely a redirect loop");
        }
        alreadyVisited.add(newUrl);
    }
    return newUrl;
}
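
The same idea can be sketched without any HTTP dependency, which makes the loop detection easy to check in isolation. The variant below is an illustration rather than the original implementation: it uses a Set for the visited URLs, adds a cap on the number of hops, and represents the status and URL pair with Map.Entry (Java 9+) instead of Pair. The names SafeExpander, MAX_HOPS and fromTable, as well as the cap of 10, are assumptions:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class SafeExpander {

    // Hypothetical cap on the number of hops; the value is an assumption.
    static final int MAX_HOPS = 10;

    // singleLevel returns (status code, next URL), mirroring expandSingleLevelSafe.
    static String expandSafe(String url,
            Function<String, Map.Entry<Integer, String>> singleLevel) {
        Set<String> visited = new HashSet<>();
        String current = url;
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            // Set.add returns false when the URL was already seen.
            if (!visited.add(current)) {
                throw new IllegalStateException("Likely a redirect loop at " + current);
            }
            Map.Entry<Integer, String> statusAndUrl = singleLevel.apply(current);
            int status = statusAndUrl.getKey();
            boolean isRedirect = status == 301 || status == 302;
            if (!isRedirect) {
                return current; // final target reached
            }
            current = statusAndUrl.getValue();
        }
        throw new IllegalStateException("Gave up after " + MAX_HOPS + " redirects");
    }

    // Simulates a shortening service from a made-up redirect table:
    // known URLs answer 301 with their target, all others answer 200.
    static Function<String, Map.Entry<Integer, String>> fromTable(Map<String, String> redirects) {
        return url -> {
            String target = redirects.get(url);
            if (target != null) {
                return Map.entry(301, target);
            }
            return Map.entry(200, url);
        };
    }
}
```

Checking the visited set before each request also catches a URL that redirects to itself, and the hop cap bounds the work for very long chains that never repeat a URL.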

And that’s it – the expandSafe mechanism is able to expand a URL that goes through an arbitrary number of URL shortening services, while correctly failing fast on redirect loops.

5. Conclusion

This tutorial discussed how to expand short URLs in Java using the Apache HttpClient.

We started with a simple use case with a URL that is only shortened once and then implemented a more generic mechanism, capable of handling multiple levels of redirects and detecting redirect loops in the process.

The implementation of these examples is available over on GitHub.