RestTemplate and Gzip Content-Encoding

1. Problem description

I recently worked on a feature that diagnoses errors in Yarn Applications. It needs to fetch an Application's runtime information from the Resource Manager, for example:

 GET /ws/v1/cluster/apps/application_1561545353229_936285 HTTP/1.1
 Host: yarn.xxx.com
 Accept: */*
 accept-encoding: gzip, deflate

Calling this endpoint from Postman returns the expected result:

 {
     "app": {
         "id": "application_1561545353229_936285",
         "user": "bp_growth",
         "name": "moses:278648",
         "queue": "root.bp_growth_dev",
         "state": "FINISHED",
         "finalStatus": "FAILED",
         "progress": 100,
         "trackingUI": "History",
         "trackingUrl": "http://yarn-rm02.tc.rack.xxx.com:8088/proxy/application_1561545353229_936285/",
         "diagnostics": "Task failed task_1561545353229_936285_m_000008\nJob failed as tasks failed. failedMaps:1 failedReduces:0\n",
         "clusterId": 1561545353229,
         "applicationType": "MAPREDUCE",
         "applicationTags": "",
         "startedTime": 1562740151124,
         "finishedTime": 1562740195218,
         "elapsedTime": 44094,
         "amContainerLogs": "http://data1117.tc.rack.xxx.com:8042/node/containerlogs/container_e93_1561545353229_936285_01_000001/bp_growth",
         "amHostHttpAddress": "data1117.tc.rack.xxx.com:8042",
         "allocatedMB": -1,
         "allocatedVCores": -1,
         "reservedMB": -1,
         "reservedVCores": -1,
         "runningContainers": -1,
         "memorySeconds": 1583688,
         "vcoreSeconds": 721,
         "preemptedResourceMB": 0,
         "preemptedResourceVCores": 0,
         "numNonAMContainerPreempted": 0,
         "numAMContainerPreempted": 0,
         "logAggregationStatus": "TIME_OUT"
     }
 }

Sending the same request with RestTemplate, however, throws an exception:

 org.springframework.http.converter.HttpMessageNotReadableException: JSON parse error: Illegal character ((CTRL-CHAR, code 31)): only regular white space (\r, \n, \t) is allowed between tokens; nested exception is com.fasterxml.jackson.core.JsonParseException: Illegal character ((CTRL-CHAR, code 31)): only regular white space (\r, \n, \t) is allowed between tokens
  at [Source: (PushbackInputStream); line: 1, column: 2]

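The cause is that the response body is compressed with Gzip, and RestTemplate hands the still-compressed bytes to Jackson. Character code 31 (0x1F) is the first byte of the gzip magic number 0x1F 0x8B, which is why the parser fails on the first character of the body.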
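To make the "code 31" in the exception concrete, here is a minimal JDK-only sketch that gzips a small JSON string and inspects the first two bytes. The class name GzipMagicDemo and the sample JSON are made up for illustration; nothing here talks to Yarn or uses Spring.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipMagicDemo {

    // Compress bytes the way a server does for "Content-Encoding: gzip".
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reverse of gzip(): the step that must happen before a JSON parser sees the body.
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative payload, not the real Resource Manager response.
        String json = "{\"app\":{\"state\":\"FINISHED\"}}";
        byte[] compressed = gzip(json.getBytes(StandardCharsets.UTF_8));

        // The gzip header starts with the magic bytes 0x1F 0x8B.
        System.out.println(compressed[0] & 0xFF); // prints 31
        System.out.println(compressed[1] & 0xFF); // prints 139

        // After decompression the original JSON is back.
        System.out.println(new String(gunzip(compressed), StandardCharsets.UTF_8));
    }
}
```

The first printed value, 31, is the control character Jackson complains about when it is given the compressed bytes directly.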

2. Solution

The simplest solution I found online comes from: How to parse gzip encoded response with RestTemplate from Spring-Web

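In short, pass a ClientHttpRequestFactory backed by Apache HttpClient when constructing the RestTemplate. HttpClient decompresses gzip responses transparently by default, so Jackson receives plain JSON.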

In a Maven project, add this dependency:

 <dependency>
   <groupId>org.apache.httpcomponents</groupId>
   <artifactId>httpclient</artifactId>
 </dependency>

Because the project also uses Spring Boot, its dependency management supplies the version, so none needs to be specified here.

Usage looks like this:

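   // Requires org.apache.http.impl.client.HttpClientBuilder,
   // org.springframework.http.client.HttpComponentsClientHttpRequestFactory,
   // org.springframework.web.client.RestTemplate, and java.util.Map.
   @Test
   public void test() {
 
     // Apache HttpClient decompresses gzip response bodies by default.
     HttpComponentsClientHttpRequestFactory clientHttpRequestFactory = new HttpComponentsClientHttpRequestFactory(
         HttpClientBuilder.create().build());
     RestTemplate restTemplate = new RestTemplate(clientHttpRequestFactory);
 
     // The decompressed JSON body is converted to a Map.
     Map<?, ?> forObject = restTemplate.getForObject("http://yarn.xxx.com/ws/v1/cluster/apps/application_1561545353229_936285", Map.class);
     System.out.println("forObject = " + forObject);
   }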
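
The HttpClient-backed factory works because the client looks at the Content-Encoding response header and unwraps the stream before Spring's message converters read it. The following JDK-only sketch illustrates that idea and is not HttpClient's actual implementation; the class ContentEncodingSketch and its helpers decodeBody, isGzip, and gzip are hypothetical names, and the response is simulated in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ContentEncodingSketch {

    // Hypothetical check on the Content-Encoding header value.
    static boolean isGzip(String contentEncoding) {
        return contentEncoding != null && "gzip".equalsIgnoreCase(contentEncoding.trim());
    }

    // Hypothetical helper: decode a response body according to its Content-Encoding.
    static String decodeBody(String contentEncoding, byte[] body) {
        try (InputStream raw = new ByteArrayInputStream(body);
             InputStream in = isGzip(contentEncoding) ? new GZIPInputStream(raw) : raw) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper that stands in for a server compressing its response.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Illustrative payload, not the real Resource Manager response.
        String json = "{\"app\":{\"finalStatus\":\"FAILED\"}}";

        // Compressed body with the header present: decompressed before use.
        System.out.println(decodeBody("gzip", gzip(json)));

        // Uncompressed body with no header: passed through unchanged.
        System.out.println(decodeBody(null, json.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Both calls print the same JSON text. In the real setup none of this is written by hand, since the request factory shown above already handles it.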