Office of Operations Freight Management and Operations

3.0 FREIGHT MOVEMENT BY AIR

3.1 Introduction

Air cargo is a key part of overall freight transport because of its dollar value, its time sensitivity, and its reliance on other shipment modes. This report outlines a method to integrate U.S. Census Bureau value data with weight data from the Department of Transportation (DOT) Office of Airline Information (OAI) in order to develop two datasets containing commodity value and weight of air shipments by origin-destination for domestic shipments and origin-port of entry/exit-destination for international shipments. The domestic dataset will contain value and weight of air shipments by origin, destination and commodity. The international dataset will contain value and weight of air shipments by origin-port of entry/exit-destination and commodity.

The aviation component of the Provisional Commodity Origin Destination Matrix, hereafter the Provisional Matrix, combines Office of Airline Information (OAI) data on the weight of shipments for the U.S. airline industry with Census/Customs (hereafter Census) data on commodity-type, value and weight for imports and exports by air, and the FAF2 domestic aviation value and weight data. The major reasons to use OAI data are that it allows a port-of-entry/exit to be estimated and that it is considered the definitive source for tons shipped of U.S. air freight. While the Census data does provide a port-of-entry/exit, it is based on the port in which a shipment clears customs rather than the first port after/before crossing the border. The main reason for using the Census data is the availability of information on commodity-type and value. The major contribution of the FAF2 domestic aviation data is to capture commodity-type and value differences between the international and domestic data. This report specifies the process for combining the OAI, Census and FAF2 data and the methodologies for estimating the port-of-entry/exit and for forecasting data for months that have not yet been reported.

3.2 Data Sources

3.2.1 OAI Data

The Office of Airline Information (OAI-BTS/RITA) publishes the Form 41 T-100 and T-100 (f) traffic data monthly on both a market and segment basis. The T-100 data contains information on the weight of air freight and mail by carrier, origin airport, and destination airport, as well as additional identifying and operational information. The OAI data is considered the definitive source of tons-shipped for the U.S. airline industry. OAI shipments are defined differently than FAF2 shipments in that OAI shipments use an airport basis (from airport origin to airport destination) rather than an establishment basis. In OAI market data, airport origin-destination refers to tons enplaned by a specific carrier at the origin airport and deplaned by the carrier at the destination airport. The T-100 market data will exclude the port-of-entry/exit whenever the port is an intermediate stop for the shipment. Origin-destination for each record on the segment component of the T-100 data refers to a non-stop leg and reports tons transported rather than tons enplaned. The T-100 segment data will include the port-of-entry/exit for international shipments, but will exclude the ultimate origin/destination when a shipment has multiple stops. Combining the market and segment data to add ports-of-entry/exit is one of the main objectives of this project. The T-100 data covering freight shipments by U.S. carriers is publicly available approximately sixty days after the end-of-month and the T-100 (f) data covering foreign carriers is publicly available approximately six months after the end-of-month. The data can be found at http://www.transtats.bts.gov/ (the T-100 (f) data is included in the versions having all carriers).

Two other differences between the OAI data and FAF2 are the lack of information on the value and commodity-composition of shipments. In order to provide information for FAF2 international air shipments, U.S. Customs data on commodity-type and value is combined with the OAI data.

The coverage of the OAI data may be summarized with a few aggregate statistics. In 2003, freight data was recorded for almost 1,500 airports worldwide. About 600 of these were international airports, that is, airports engaged in shipments between the U.S. and other countries. About 200 of these international airports were located in the U.S. and its territories. The OAI T-100 data covers large certificated U.S. commercial carriers; since 10/2002, commuter and small certificated carriers are covered as well, although these will account for only a negligible amount of international air shipments. The T-100 (f) covers foreign carriers serving the U.S. Included in these carriers are parcel, courier, and express carriers, which are treated as a separate mode in FAF2. In 2003, the T-100 and T-100 (f) showed 244 air carriers shipping freight in the U.S., and 188 carriers shipping freight between the U.S. and other countries (119 of these were foreign carriers). Like FAF2, the public version of the T-100 data excludes in-transit shipments from the market data and foreign-to-foreign shipments from the segment data; however, see Further Notes on the Data below for a qualification. The T-100 data does not include private or illegal shipments of freight, and passenger baggage is not counted as freight.

3.2.2 Census Foreign Trade Data

The Census Bureau Foreign Trade Division (FTD) (http://www.census.gov/foreign-trade/reference/products/index.html) publishes two monthly paid subscription series that largely satisfy the need for international air data. The data is collected by the U.S. Customs Service and published as: 1) U.S. Exports of Merchandise – Monthly – DVD ROM (information on the value, quantity, method of transportation, and shipping weights for 9,000 export commodities, 240 trading partners, and 45 Districts); 2) U.S. Imports of Merchandise – Monthly – DVD ROM (data on more than 17,000 commodities for 240 trading partners and 45 Districts, including value, quantity, method of transportation, shipping weights, import charges, duties and much more). Shipments are for all merchandise between foreign countries and U.S. Customs Territories (50 states, District of Columbia, Puerto Rico, the U.S. Virgin Islands, and U.S. Foreign Trade Zones). The objective is to capture the physical movement of merchandise between foreign countries and the U.S.; the data includes government and non-government shipments and does not depend on the shipment being part of a commercial transaction.

A shipment's origin-destination on the Census data is based on Customs Districts and where the shipment is processed by the Customs Service. For FAF2 purposes it is important to note that a Customs District may include more than one state and a state may have more than one Customs District. The Export data satisfies the need for mode-destination-port of origin-tonnage-dollar value, but lacks port-of-exit data. The Import data satisfies the need for mode-origin-destination-tonnage-dollar value, but defines port-of-entry as the port in which the shipment clears customs rather than the first port after crossing the border. Commodities are reported using the 10-digit Harmonized Tariff Schedule (Schedule B for exports), which can be translated to SCTG using a crosswalk provided by FHWA. Export values are reported free-alongside-ship (F.A.S.). Import values are available either as customs-import-value (C.I.V.), which excludes duties, freight, insurance and other costs of importation, or as cost-insurance-freight (C.I.F.), which adds freight and insurance to the C.I.V. For FAF2 the C.I.F. values are used to better reflect the shipment's value at the border. The data is available approximately three months after the end-of-month. Export data is recorded in the month in which the shipment leaves the country, corresponding to the FAF2 definition. However, import data is recorded in the month in which it clears customs and may therefore not correspond to the month the shipment was transported into the country due to time spent in bonded warehouses or Foreign Trade Zones (FTZs). Like FAF2, the Census data excludes in-transit shipments. Although the Census Bureau data provides vital information for the FAF2 project, there is also a substantial on-going cost to subscribe to the dataset, currently $2,700/year for both Imports and Exports. Therefore it may be useful to consider, for future use, a related subset of data that is available on-line for $75 for a one-month subscription at http://www.usatradeonline.gov/usatrade.nsf?Open&mc=F9000. Appendix A compares the dimensions of the data sources used to produce the provisional estimates.

3.2.3 Further Notes on the Data

  1. Although the T-100 market/segment data includes information on the largest cargo carriers, it excludes information for some all-cargo carriers.
  2. The methodology below can be applied to cargo and mail either separately or together as freight and mail combined. One concern with using cargo and mail separately is an on-going dispute between Federal Express and OAI as to how U.S. mail should be reported. Federal Express lumps mail with freight due to concerns about disclosing the size of its contract with the U.S. Postal Service. Cargo by itself will then tend to overstate actual cargo shipments. For purposes of this report, combined freight and mail is used while recognizing that FAF2 treats parcels/mail as a separate mode.
  3. Neither of the OAI datasets gives information on the initial origin at the manufacturer or the ultimate destination at the purchaser of the products. However, it is very unlikely that a given shipment of air freight originates outside the FAF zone where the airport is located. It is also very unlikely that a shipment of air cargo will be transported outside the FAF region where the airport is located. One more issue is that the data is carrier-based, so a shipment that involves more than one carrier will misrepresent the initial origin and ultimate destination of the shipment. The methodology here assigns airport origins and destinations and will distribute to ultimate origins and destinations based on the methodology used for the 2002 FAF2.
  4. The OAI market data is reported by carriers and covers enplanements and deplanements of freight and mail. Although the public version of the dataset excludes in-transit shipments, it is likely that some in-transits are included since a shipment that changed either carriers or planes would not be excluded.
  5. There was a substantial expansion in coverage of the T-100 OAI data in October 2002 to include all-cargo carriers, small-certificated and commuter carriers. For estimation of growth rates across years involving 2002, carrier growth rates in revenue ton-miles of freight and mail (available from the T1 data) were used to backcast 2003 monthly data for individual routes for the largest all-cargo carriers. For example, the FedEx tons enplaned at Memphis and deplaned in Seattle in January 2003 would be decreased by the January 2002-January 2003 growth rate in FedEx domestic revenue ton-miles of freight and mail to obtain an estimate of January 2002 tons shipped by FedEx between Memphis and Seattle.
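
The backcasting adjustment in note 5 can be illustrated with a minimal sketch. It assumes that "decreased by the growth rate" means dividing the later-period tons by one plus the carrier's growth rate in revenue ton-miles (RTM); the function name and all numbers are hypothetical and are not taken from the T-100 or T1 data.

```python
def backcast_tons(tons_later, rtm_earlier, rtm_later):
    """Estimate earlier-period route tons from later-period route tons.

    Assumption: the route grew at the same rate as the carrier's
    revenue ton-miles (RTM), so tons_earlier = tons_later / (1 + g).
    """
    if rtm_earlier <= 0:
        raise ValueError("rtm_earlier must be positive")
    growth = rtm_later / rtm_earlier - 1.0  # carrier-level growth rate g
    return tons_later / (1.0 + growth)


# Hypothetical inputs: 1,100 tons on the route in the later month and
# carrier RTM rising from 500 to 550 (10% growth) over the same period.
jan_2002_tons = backcast_tons(tons_later=1100.0, rtm_earlier=500.0, rtm_later=550.0)
print(round(jan_2002_tons, 1))
```

Dividing by one plus the growth rate, rather than subtracting the rate times the later tons, keeps the backcast consistent with re-applying the same growth rate forward.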

3.3 Combining Census and OAI Data

3.3.1 Cross-Walks for Commodity and Geographic Information

Combining the OAI and Census data into a FAF2 dataset requires reconciling the different levels of detail at which commodity and geographic identifying information is reported. In the case of commodity-types and values, the OAI data is at a more general level than is required by FAF2 – a topic that is covered in the sections below on estimation. This section covers the cross-walks used to reconcile differences between the commodity-types on the Census and FAF2 datasets and the geographic information on all three datasets.

Several of the cross-walks were already available from FHWA. Commodity cross-walks from the Harmonized System used in the Census Foreign Trade files to the SCTG codes used in FAF2 are available at http://ops.fhwa.dot.gov/freight/freight_analysis/faf/faf2_tech_document.htm. Cross-walks between countries and foreign trade regions are available upon request from FHWA (contact Tianjia Tang at Tianjia.Tang@fhwa.dot.gov). A third cross-walk from U.S. counties to FAF regions was also provided by FHWA.

The cross-walks to be developed are translations between different levels of specificity for geographic information between the OAI data and Census/FAF2. The OAI geography is based on airports, the most specific level of detail, and is used as the link between the other two. Each airport is assigned to both a Customs District and a FAF2 region so that the relevant (dis)aggregation can be accomplished.

The first cross-walk developed for FAF2 International Aviation is from U.S. airports to counties, which is used in combination with the existing cross-walk from counties to FAF2 regions. The matching process requires two supplemental files: the Master Coordinates File (MCF) from OAI, available at http://www.transtats.bts.gov/Tables.asp?DB_ID=595&DB_Name=Aviation%20Support%20Tables&DB_Short_Name=Aviation%20Support%20Tables, and the county subdivision file from Census, available at http://www.census.gov/geo/www/gazetteer/places2k.html.

Two other sources also proved useful when the assignment of an airport to a county was unresolved from the first round of processing: Mapquest ® at http://www.mapquest.com/maps/ and the National Association of Counties website at http://www.naco.org/Template.cfm?Section=Data_and_Demographics&Template=/cffiles/counties/city_srch.cfm. Both the MCF and County Subdivision files have information on the state and on latitude and longitude. Within each state, the airports are matched to the two closest county subdivisions. Two subdivisions are matched because an airport may be near the border of its actual county and closer to the geographic center of another county. When the two closest subdivisions were in the same county, the airport was assigned to that county. When the two closest subdivisions were in different counties, the airport city name from the MCF was used to determine the county using either Mapquest ® or the National Association of Counties website.
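
The two-closest-subdivisions rule described above can be sketched as follows. This is an illustrative sketch only: the great-circle (haversine) distance, the record layout, and the function names are assumptions, and the coordinates are invented rather than taken from the MCF or the county subdivision file.

```python
import math


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    radius = 3958.8  # approximate mean Earth radius in miles
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def assign_county(airport, subdivisions):
    """Return (county, needs_review) for one airport.

    The airport is compared only with subdivisions in its own state.
    If the two closest subdivisions share a county, that county is
    assigned; otherwise the airport is flagged for manual lookup.
    """
    candidates = [s for s in subdivisions if s["state"] == airport["state"]]
    candidates.sort(
        key=lambda s: haversine_miles(airport["lat"], airport["lon"], s["lat"], s["lon"])
    )
    counties = {s["county"] for s in candidates[:2]}
    if len(counties) == 1:
        return counties.pop(), False
    return None, True


# Invented records for illustration only.
subdivisions = [
    {"state": "XX", "county": "Alpha", "lat": 10.0, "lon": 10.0},
    {"state": "XX", "county": "Alpha", "lat": 10.1, "lon": 10.0},
    {"state": "XX", "county": "Beta", "lat": 12.0, "lon": 10.0},
    {"state": "YY", "county": "Gamma", "lat": 10.01, "lon": 10.0},
]
print(assign_county({"state": "XX", "lat": 10.02, "lon": 10.0}, subdivisions))
print(assign_county({"state": "XX", "lat": 11.2, "lon": 10.0}, subdivisions))
```

Airports flagged with `needs_review` correspond to the cases that were resolved by hand using the airport city name.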

The second cross-walk developed for FAF2 International Aviation is from U.S. airports to U.S. Customs Districts. As above, the MCF provides information on airports in the form of airport name, state, city name, and latitude and longitude. In order to assign airports to Customs Districts a hierarchical matching method is used. Matching airports to Customs Districts is more complicated because Customs Districts are less uniform than counties, i.e., a state may have multiple Customs Districts, no named District or Sub-District, or a Customs District may span more than one state. While Customs Sub-Districts also consist of places in the usual geographic sense of cities or regions, they may also be airports and business places (e.g. FedEx processing centers).

Matching Customs Sub-Districts to Airports

  1. For those Sub-Districts which are also airports, assign the airport to that Sub-District. A list of Customs Districts/Sub-Districts is available at http://www.census.gov/foreign-trade/schedules/d/dist.txt.
  2. For the remaining Sub-Districts, match the Sub-District name to a Census Place Name (a list of census place names, that includes latitude and longitude, is also available at http://www.census.gov/geo/www/gazetteer/places2k.html). This process required much hand-editing and also the use of supplemental information from the CFR, customs, Mapquest ® and the National Association of Counties websites.
  3. Use the latitude and longitude information available from both the Census Places file and the MCF to determine the two closest Sub-Districts for each airport. Choose between the two based on the airport city name or additional information. Note that an airport may be, and often is, on the outskirts of its actual place, and therefore closer to a second place.

The matching process resulted in each U.S. airport being assigned to a Customs Sub-District.

3.3.2 Estimating Flows by Weight

Estimation for domestic and international data is substantially different, with the estimation for domestic data being straight-forward. Estimation of domestic data consists of calculating growth rates from OAI domestic market data by FAF origin region between the CFS survey year and the provisional year required for FAF. These growth rates are then applied to the individual commodity weights from 2002 FAF by origin region to obtain the estimates.
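
As a rough illustration of the domestic procedure, the sketch below computes a growth factor for each origin region from OAI tons and applies it to base-year commodity weights. The dictionary layouts, region and commodity labels, and the choice to leave a region unchanged when it has no OAI data are assumptions made for this example.

```python
def growth_factors(oai_base, oai_prov):
    """Ratio of provisional-year to base-year OAI tons by origin region."""
    return {
        region: oai_prov[region] / base_tons
        for region, base_tons in oai_base.items()
        if base_tons > 0 and region in oai_prov
    }


def scale_weights(faf_base, factors):
    """Apply each origin region's growth factor to its commodity weights.

    faf_base maps (origin, destination, commodity) to base-year tons.
    Regions without a factor are left unchanged (an assumption).
    """
    return {
        key: tons * factors.get(key[0], 1.0)
        for key, tons in faf_base.items()
    }


# Hypothetical inputs, not actual OAI or FAF values.
oai_base = {"R1": 100.0, "R2": 200.0}
oai_prov = {"R1": 110.0, "R2": 150.0}
faf_base = {("R1", "R2", "c1"): 10.0, ("R2", "R1", "c2"): 4.0}

estimates = scale_weights(faf_base, growth_factors(oai_base, oai_prov))
print(estimates)
```

Because every commodity shipped from a region is scaled by the same factor, commodity shares by origin region are unchanged by this step.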

The OAI market data for international shipments is missing the port-of-entry/exit. The Census foreign trade data is missing the port-of-exit for exports, and its port-of-entry for imports does not necessarily correspond to the FAF definition of port-of-entry. This section outlines a procedure for reconciling the differences between the two datasets and assigning a port-of-entry/exit to the OAI market data based on the OAI segment data. The guiding philosophy behind the algorithm is to impose aggregate efficiency by minimizing the distance transported at each step. The specification of the algorithm is based on a port-of-exit. The extension to a port-of-entry is straight-forward.

Notation:
Superscripts:
1st Position: M = market data, S = segment data, F = FAF2 results.
2nd Position: t = time period. Time periods are annual.
Subscripts:

1st Position: i = origin airport
Last Position: j = destination airport
Intermediate Position in the case of 3 nodes: k = port-of-entry/exit

T = tons shipped.

  a) For market routes that match non-stop segment routes for both origin and destination and by carrier, assign $\min(T^{Mt}_{ij}, T^{St}_{ij})$ to $T^{Ft}_{ij}$ and reduce both $T^{Mt}_{ij}$ and $T^{St}_{ij}$ by $T^{Ft}_{ij}$ to obtain residual tonnage for each market route and port.
  b) Determine the remaining market and segment routes from the remainders from step a) and all market and segment routes that did not match in step a).
  c) Create two-leg routes from the segment data in which the origin and destination match the origin and destination from the market routes in step b) with the intermediate stop restricted to be domestic: $T^{St}_{ikj} = \min(T^{St}_{ik}, T^{St}_{kj})$ where {i, j} correspond to {i, j} from b) and k is domestic.
    1. For each carrier and each market route, find the best (based on shortest distance) intermediate stop. Let the distance for this route be given by $\mathrm{dist}(i k_1 j)$.
    2. For each carrier and each market route, find the second best intermediate stop with distance given by $\mathrm{dist}(i k_2 j)$.
    3. Calculate the cost-savings for each route of using the best intermediate stop = $\mathrm{dist}(i k_2 j) - \mathrm{dist}(i k_1 j)$.
    4. For each carrier, find the route which gives the greatest cost savings and denote this route $(ikj)^*$. Then let $T^{Ft}_{(ikj)^*} = \min(T^{Mt}_{ij}, T^{St}_{(ikj)^*})$.
    5. Bookkeeping: Reduce $T^{Mt}_{ij}$, $T^{St}_{ik}$ and $T^{St}_{kj}$ by $T^{Ft}_{(ikj)^*}$ for the carrier.
    6. Repeat sub-steps 1-5 until all market routes have been evaluated for all carriers.
  d) Determine the remaining market and segment routes from the remainders from step b) and those routes for which no two-leg routes could be formed.
  e) For each carrier, aggregate airports to their FAF region level. Recalculate distance as the ratio of ton-miles to tons transported rather than airport-to-airport distance.
  f) Rerun step c) using the FAF region level rather than the airport level.
  g) For the international routes that were unmatched in steps a), c), and f), assign the port-of-entry/exit to be the domestic destination/origin.
  h) Create an international dataset based on international routes from steps a), c) and f), plus the international market routes from step g). (Note: Step h) is actually done in tandem with the assignment of commodity-type and value. This aspect is excluded here for simplicity.)
  i) Aggregate the results over carriers and airports to FAF regions.

The result of this algorithm will be a dataset with shipment weights by origin-port of exit-destination. Matching is done at the carrier level to preserve the correspondence between market and segment data in the OAI datasets. Appendix B, Table B1, provides round-by-round results of the estimation. Round 1 corresponds to step a), round 2 to step c), round 3 to step f), and round 4 to step h). For imports (exports), rounds 2 and 3 assign a different port-of-entry (exit) than the domestic destination (origin). These two rounds accounted for about 17% of imports and 13% of exports. A large majority of the data, more than 70% for both imports and exports, is assigned its port-of-entry/exit in the first round where it is equal to the original port-of-entry (exit) for imports (exports).
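
A simplified sketch of steps a) and c) for a single carrier is given below. It is not the production procedure: the function names, the dictionary layouts, the airport codes, and the rule that a route with only one feasible intermediate stop is treated as having unlimited cost savings are all assumptions made for illustration.

```python
def match_nonstop(market, segment):
    """Step a): assign min(market, segment) tons on routes present in both."""
    assigned = {}
    for route in market:
        tons = min(market[route], segment.get(route, 0))
        if tons > 0:
            assigned[route] = tons
            market[route] -= tons
            segment[route] -= tons
    return assigned


def match_two_leg(market, segment, leg_dist, domestic):
    """Step c): greedily assign two-leg routes i-k-j with k domestic.

    Each pass finds, for every market route with residual tons, the
    shortest and second-shortest feasible intermediate stops, then
    assigns the route with the largest distance saving.
    """
    assigned = {}
    while True:
        best = None  # (saving, i, k, j)
        for (i, j), tons in market.items():
            if tons <= 0:
                continue
            options = []
            for k in sorted(domestic):
                if k in (i, j):
                    continue
                capacity = min(segment.get((i, k), 0), segment.get((k, j), 0))
                if capacity > 0 and (i, k) in leg_dist and (k, j) in leg_dist:
                    options.append((leg_dist[(i, k)] + leg_dist[(k, j)], k))
            if not options:
                continue
            options.sort()
            # Assumption: a lone feasible stop is treated as an unlimited saving.
            saving = options[1][0] - options[0][0] if len(options) > 1 else float("inf")
            if best is None or saving > best[0]:
                best = (saving, i, options[0][1], j)
        if best is None:
            return assigned
        _, i, k, j = best
        tons = min(market[(i, j)], segment[(i, k)], segment[(k, j)])
        assigned[(i, k, j)] = assigned.get((i, k, j), 0) + tons
        # Bookkeeping: reduce the market route and both segment legs.
        market[(i, j)] -= tons
        segment[(i, k)] -= tons
        segment[(k, j)] -= tons


# Hypothetical export example: A, B, C are domestic airports, Z is foreign.
market = {("A", "Z"): 10, ("B", "Z"): 5}
segment = {("B", "Z"): 12, ("A", "B"): 8, ("A", "C"): 4, ("C", "Z"): 4}
leg_dist = {("A", "B"): 100, ("B", "Z"): 1000, ("A", "C"): 300, ("C", "Z"): 1000}

nonstop = match_nonstop(market, segment)
two_leg = match_two_leg(market, segment, leg_dist, {"A", "B", "C"})
print(nonstop)
print(two_leg)
```

In this example, 5 tons are matched non-stop from B to Z, 7 of the 10 tons from A are routed through B as the port-of-exit, and the remaining 3 tons are routed through C once the B-to-Z segment capacity is exhausted.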

3.3.3 Estimating Commodity Composition and Value

3.3.3.1 International Routes

Commodity composition and value are available from Census for exports at the domestic origin, and for imports at the domestic destination based on the Census geographic definitions. The Census information is used to estimate commodity composition and value for the OAI data. The corresponding domestic origin for exports and domestic destination for imports from the OAI data will be referred to as the matching ports.

The first step in the estimation process is to determine whether it is reasonable to combine the two data sources. Evidence that combining the data is reasonable is given by the high correlation for tonnage values at the matching ports between the two data sources (see the bottom of Table B2 in Appendix B). Two caveats to this estimation need to be noted. The first is that the OAI data is about 20% larger than the Census data for both imports and exports. Although there are several differences between the data sources, the strongest explanation is that the difference is due to the OAI data including more in-transit shipments than the Census data. The OAI data is based on carrier reporting with market routes defined by enplanement and deplanement of the cargo. An in-transit shipment that switches carriers in the U.S., or which is transferred from one plane to another by the same carrier, would appear as an import/export on the OAI data. However, the same shipment would be more likely to be designated as in-transit in Customs' reporting to Census. Additional evidence that differences are due to in-transit shipments is that the Customs Districts with the largest differences are also likely transshipment ports: New York, Miami, and Anchorage. The second caveat is that the differences are large for only a few Customs Districts, which affects whether it is reasonable to apply Census information to the OAI data on a district-by-district basis. The second caveat is addressed in the estimation process.

The estimation philosophy is to assign the Census commodity distribution and value-per-ton by commodity (prices) to the OAI data for each matching port while keeping the aggregate commodity distribution and prices equal to that for the Census data. Because of the differences in tonnage for some key ports a straight-forward port-by-port application would result in large differences at the aggregate level. The first step in the port-by-port estimation is to rescale Census exports/imports by the ratio of the respective OAI-to-Census aggregates. The approach taken here has two parts. For the share of a matching port's tons that is on both the OAI and rescaled Census data, the distribution and prices are taken directly from the Census data for that port. The remainder can be either excess Census tons, or excess OAI tons. The commodities and values for matching ports with excess Census tons are then aggregated to define residual commodity shares and prices. The residual commodity shares and prices are then applied to those Customs Districts with excess OAI tons. The result is an OAI-based dataset that reflects the Census commodity distribution and prices at the aggregate level and also captures a large share of Census port-level differences in commodities and prices.

More formally, the estimation algorithm can be written in terms of exports as follows:

Notation:
Superscripts:
1st Position: O = OAI data, C = Census data, R = residual of OAI minus Census, F = FAF2 results.
2nd Position: t = time period. Time periods are annual.
Subscripts:
1st Position: i = origin Customs District, i=1,…,I, I=41.
2nd Position: j = commodity j, j=1,…,J, J=33
When the capital of the subscript letter is used, it denotes the sum over all values of the subscript.

T = tons shipped.
V = value.

α = a 33x1 vector of commodity shares, $\alpha^{Ct}_{ij} = \frac{T^{Ct}_{ij}}{\sum_{j=1}^{J} T^{Ct}_{ij}} = \frac{T^{Ct}_{ij}}{T^{Ct}_{iJ}}$.

p = a 33x1 vector of commodity prices, $p^{Ct}_{ij} = \frac{V^{Ct}_{ij}}{T^{Ct}_{ij}}$.

σ = the export scale factor = $\frac{\sum_{i=1}^{I} \sum_{j=1}^{J} T^{Ot}_{ij}}{\sum_{i=1}^{I} \sum_{j=1}^{J} T^{Ct}_{ij}} = \frac{T^{Ot}_{IJ}}{T^{Ct}_{IJ}}$.

  1. Let $T^{Rt}_{iJ} = T^{Ot}_{iJ} - \sigma T^{Ct}_{iJ}$.
  2. Let $A = \{ i \mid T^{Rt}_{iJ} < 0 \}$ and $B = \{ i \mid T^{Rt}_{iJ} > 0 \}$.
  3. Let $\alpha^{At}_{j} = \frac{\sum_{i \in A} T^{Rt}_{ij}}{\sum_{i \in A} T^{Rt}_{iJ}}$.
  4. Let $p^{At}_{j} = \frac{\sum_{i \in A} p^{Ct}_{ij} \times \alpha^{Ct}_{ij} \times T^{Rt}_{iJ}}{\sum_{i \in A} \alpha^{Ct}_{ij} \times T^{Rt}_{iJ}}$.
  5. Let $M^{t}_{iJ} = \min(T^{Ot}_{iJ}, \sigma T^{Ct}_{iJ})$.
  6. Then $T^{Ft}_{ij} = \alpha^{Ct}_{ij} M^{t}_{iJ} + \max(\alpha^{At}_{j} T^{Rt}_{iJ}, 0)$,
  7. and $\alpha^{Ft}_{ij} = \frac{T^{Ft}_{ij}}{T^{Ft}_{iJ}}$,
  8. and $p^{Ft}_{ij} = \frac{p^{Ct}_{ij} \times \alpha^{Ct}_{ij} \times M^{t}_{iJ} + \max(p^{At}_{j} \times \alpha^{At}_{j} \times T^{Rt}_{iJ}, 0)}{T^{Ft}_{ij}}$.

The resulting FAF commodity shares and prices are then applied at the airport level before aggregating to create tons and value at the FAF regional level.
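As a rough illustration of steps 6 through 8, the following Python sketch combines the two sets of shares and prices for a single route i. The function and argument names are hypothetical. The code assumes that the capital-J subscript denotes the total over all commodities j, so that $T_{iJ}^{Ft}$ is the sum of $T_{ij}^{Ft}$; the meaning of each input symbol is as defined in the earlier steps of the list.

```python
def combine_commodity_estimates(alpha_c, price_c, m_total, alpha_a, price_a, t_resid):
    """Sketch of steps 6-8 for one route i (names are illustrative only).

    alpha_c[j], price_c[j]: shares and prices with superscript C
    m_total:                M_iJ for the route
    alpha_a[j], price_a[j]: shares and prices with superscript A
    t_resid:                T^R_iJ for the route (may be negative)
    """
    tons, price = {}, {}
    for j in alpha_c:
        first_part = alpha_c[j] * m_total
        # Step 6: a negative second term is floored at zero.
        tons[j] = first_part + max(alpha_a[j] * t_resid, 0.0)
        # Step 8: value-weighted price, using the same floor on the second term.
        value = price_c[j] * first_part + max(price_a[j] * alpha_a[j] * t_resid, 0.0)
        price[j] = value / tons[j] if tons[j] > 0 else 0.0
    # Step 7: shares relative to the route total (assumed to be the sum over j).
    total = sum(tons.values())
    share = {j: (tons[j] / total if total > 0 else 0.0) for j in tons}
    return tons, share, price
```

When the second term is negative for every commodity, the sketch returns the first set of shares and prices unchanged, which is the behavior implied by the max operator.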

3.3.3.2 Domestic Routes

The only information available on commodity distribution and prices for domestic routes is the 2002 CFS, which is also the basis for the 2002 FAF database, and this is used as the base for estimation (see footnote 6). To reflect values for non-survey years more accurately, the commodity price information from the CFS is updated using Census commodity price data on exports. Exports are used for two reasons: exports more closely resemble domestic production than imports do and, for 2002, the commodity shares and prices of exports are more highly correlated with CFS commodity shares and prices. In particular, commodity prices for exports are calculated at the national level for the provisional estimation year, and the ratio of each commodity price in the provisional year to its 2002 level is then used to inflate or deflate the domestic commodity prices obtained from the 2002 FAF data.
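The price update can be sketched in a few lines of Python. The function name, the dictionary layout, and the commodity keys are hypothetical; the only logic taken from the text is the ratio of provisional-year to 2002 national export prices applied to the 2002 FAF domestic price.

```python
def update_domestic_prices(faf2002_prices, export_prices_2002, export_prices_current):
    """Scale 2002 domestic prices by the change in national export prices.

    All arguments are dicts keyed by commodity; prices are value per ton.
    """
    updated = {}
    for commodity, base_price in faf2002_prices.items():
        # Ratio of the provisional-year export price to its 2002 level.
        ratio = export_prices_current[commodity] / export_prices_2002[commodity]
        updated[commodity] = base_price * ratio
    return updated
```

For example, a commodity with a 2002 domestic price of 1,000 per ton whose national export price rose from 50 to 55 would be assigned a provisional domestic price of about 1,100 per ton.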

Domestic air freight commodity shares are unchanged at the individual route level and when aggregated by origin FAF region, since individual commodity weights are estimated to grow at the same rate as the weight of OAI shipments from the origin FAF region. Note, however, that commodity shares for destination regions can change, since they receive shipments from more than one origin region.

3.4 Forecasting Data for the Remainder of the Year

The first release of current year estimates is to be made available in December of the current year, so forecasts are required for weight, value and the commodity distribution for unreported months (see footnote 7). For both the Census data and domestic shipments by domestic carriers in the OAI data, the missing data consist of the fourth quarter of the most recent year. Foreign carriers in the OAI data (who have minimal domestic shipments) and domestic carriers' international shipments will require the third and fourth quarters to be forecast. The specific forecast techniques are selected based on historical evidence, but the basic approach is to first forecast tons shipped based on the OAI data, and then to forecast values and the commodity distribution based on the Census data. The forecasts use the most recent data on annual changes to update the available data from the fourth quarter of the previous year. Using the available data from the fourth quarter helps to retain the seasonal pattern for routes, commodity distribution and relative prices for the fourth quarter of the current year. The results of the forecasts are then used to supplement the available data so that the methods described above for estimating air freight flows can be applied.

The specific technique belongs to the broad class of time-series techniques. The general alternative to time-series techniques is model-based techniques, which hypothesize relations between variables and estimate a model based on those relations. The problem with model-based techniques for the FAF2 is that the use of variables outside the database (e.g., fuel prices) restricts how the forecast data can be used for independent study (e.g., how does the price of fuel affect congestion). If fuel prices were used to help forecast missing data, the effect of fuel prices would be pre-determined by the forecast model rather than reflecting actual conditions. Time-series techniques, in contrast, use only the past histories of the variables of concern to forecast the future.

3.4.1 Forecasting the OAI Data for Tons Shipped

Two significant events have changed the characteristics of the OAI data and limited the usefulness of the series history prior to 2002. The first is the September 11, 2001 terrorist attacks, which had a profound direct effect on aviation. The second is the expansion of T-100 carrier coverage in October 2002 to include small-certificated, commuter and all-cargo carriers. Carriers that began full reporting in 2002 are referred to as new-reporters, while those that fully reported prior to 2002 are referred to as prior-reporters. The primary impact of this change is on domestic tons shipped, because international operations were already reported prior to October 2002. Of particular significance, the domestic operations of Federal Express were not publicly reported prior to October 2002. Given these events, the historical period used as a base for forecasting is restricted to 2002 and later. The growth rates that are the basis of the forecasts are also restricted to depend only on information from the previous year, to allow for an evolving trend following September 11. The limited availability of data reduces the number of parameters that can be estimated and the ability to apply standard statistical tests. For these reasons, simple techniques that depend on only one estimated parameter were considered.

The techniques examined use data on annual growth rates between the previous and current calendar year and then apply these growth rates to the missing quarter(s) from the previous calendar year. For example, one of the forecasts for domestic carriers uses the growth rate from the third quarter of the previous year to the third quarter of the current year. The forecast for the fourth quarter of the current year is obtained by applying this growth rate to the level of tons enplaned in the fourth quarter of the previous year. Annual growth rates are used to avoid seasonal effects, which may also have changed since September 11. The forecasts considered differ along three dimensions: whether the time-period used to calculate the growth rates is the year-to-date (YTD) or the most recent completed quarter (depending on availability) relative to the same period in the previous year, whether to forecast domestic and international routes separately or in combination, and whether to forecast prior- and new-reporting carriers separately or in combination.

Late reporting by carriers to OAI may also affect the most recent data. To correct for this missing carrier effect, an adjustment is made to the data for the most recent year. The adjustment is based on the assumption that late-reporting carriers grew at the same rate as those who reported on time. Adjusted growth rates are calculated for each month after January, with the growth rate for each month based on aggregate enplaned tons from the subset of carriers who reported in both the current and previous month. The adjusted growth rate is then applied consecutively to each month after January, subject to the constraint that the adjusted aggregate enplaned tons are at least as large as the aggregate enplaned tons obtained directly from the data (since the adjustment is to account for late reporters).

3.4.1.1 Mathematical Specification of the Adjusted Tons and Forecasts (see footnote 8)

$T_{t,i,j,k}^{m/q,\,h}$ = tons enplaned and $F_{t,i,j,k}^{m/q,\,h}$ = the forecast of tons enplaned as defined below.

Let m/q denote either m, the current month, or q, the current quarter. M = the latest month for which data is available for the carrier type in the current year (generally September for domestic carriers and June for foreign carriers), and Q = the latest quarter for which data is available for the carrier type in the current year (generally the third quarter for domestic carriers and the second quarter for foreign carriers).

Let h = n, r, a, u index carrier subsets, where n = no restrictions on carriers included for the respective group, r = carriers are restricted to have reported in both the current and previous month, a = adjusted for missing carriers and u = unadjusted for missing carriers.

Let t = y indicate the current year.

Let i=b, d, s, c index route groupings, where b = international (border-crossing) routes, d=domestic routes, s = the sum of forecasts of international and domestic routes, and c = the forecast based on the growth rate of combined domestic and international routes.

Let j = n, p, s, c, f index carrier groups, where n = new-reporters, p = prior-reporters, s = the aggregate of separate domestic carrier groups (the sum of n and p forecast individually), c = domestic carriers combined over both new- and prior-reporters, and f = foreign carriers.

Let k = 1,2 where 1 indicates growth rates based on the most recent available quarter, and 2 indicates growth rates based on the most recent available year-to-date.

Calculation of Adjusted Growth Rates

Let $G_{y,i,j,\cdot}^{m,r} = \dfrac{T_{y,i,j}^{m,r}}{T_{y,i,j}^{m-1,r}}$, for m = 2, …, M, i = b,d,c and j = f,n,p,c.

Let $T_{y,i,j}^{1,a} = T_{y,i,j}^{1,u}$, for i = b,d,c and j = f,n,p,c.

Then $T_{y,i,j}^{m,a} = \max\left(T_{y,i,j}^{m,u},\ G_{y,i,j,\cdot}^{m,r} \times T_{y,i,j}^{m-1,a}\right)$, for m = 2, …, M, i = b,d,c and j = f,n,p,c.
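The recursion for adjusted tons can be sketched in Python for a single route grouping i and carrier group j. The function name and the list layout are hypothetical: index 0 stands for January, and the two restricted series hold, for each month after January, the tons in that month and in the preceding month for the subset of carriers that reported in both.

```python
def adjust_for_late_reporters(unadjusted, restricted_current, restricted_previous):
    """Return adjusted monthly tons T^{m,a} for months 1..M.

    unadjusted[m]:          T^{m,u}, aggregate tons as reported (index 0 = January)
    restricted_current[m]:  T^{m,r}, tons in month m from carriers reporting in m and m-1
    restricted_previous[m]: T^{m-1,r}, tons in month m-1 from the same carriers
    Index 0 of the two restricted lists is unused.
    """
    adjusted = [unadjusted[0]]  # January is left unadjusted.
    for m in range(1, len(unadjusted)):
        growth = restricted_current[m] / restricted_previous[m]
        # Never adjust below what was actually reported.
        adjusted.append(max(unadjusted[m], growth * adjusted[m - 1]))
    return adjusted
```

Because each adjusted month is built on the previous adjusted month, a shortfall caused by a late reporter carries forward through the remaining months rather than being reset.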

Forecasts:
First define atomistic levels of the forecast variables in a general sense and then define specific forecasts for the general level of all carriers and all routes.

Let $G_{y,i,j,1}^{\cdot,a} = \dfrac{T_{y,i,j}^{Q,a}}{T_{y-1,i,j}^{Q,u}}$ and $G_{y,i,j,2}^{\cdot,a} = \dfrac{\sum_{q=1}^{Q} T_{y,i,j}^{q,a}}{\sum_{q=1}^{Q} T_{y-1,i,j}^{q,u}}$

and

$F_{y,i,j,k}^{q,a} = G_{y,i,j,k}^{\cdot,a} \times T_{y-1,i,j,\cdot}^{q,u}$, where i = b,d,c, j = f,n,p,c, k = 1,2 and q depends on j. If j = f then q = 3,4; otherwise q = 4.
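A minimal Python sketch of the two growth rates and the resulting forecasts, for one route grouping and one carrier group, is given below. The function names and the quarter-keyed dictionaries are hypothetical; the current-year series is assumed to be already adjusted for late reporters.

```python
def growth_rates(adjusted_current, unadjusted_previous, latest_quarter):
    """Return {1: latest-quarter growth, 2: year-to-date growth}.

    Both inputs are dicts of tons keyed by quarter number (1..4).
    """
    quarters = range(1, latest_quarter + 1)
    g_quarter = adjusted_current[latest_quarter] / unadjusted_previous[latest_quarter]
    g_ytd = (sum(adjusted_current[q] for q in quarters)
             / sum(unadjusted_previous[q] for q in quarters))
    return {1: g_quarter, 2: g_ytd}


def forecast_missing_quarters(adjusted_current, unadjusted_previous, latest_quarter, k):
    """Apply growth rate k to each unreported quarter of the previous year."""
    g = growth_rates(adjusted_current, unadjusted_previous, latest_quarter)[k]
    return {q: g * unadjusted_previous[q] for q in range(latest_quarter + 1, 5)}
```

With `latest_quarter=3` the sketch forecasts only the fourth quarter, as for domestic carriers; with `latest_quarter=2` it forecasts the third and fourth quarters, as for foreign carriers.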

General Level Forecasts for All Carriers and Regions
There are four forecasts for the third quarter of the current year:

Separate Regions (Domestic and International)
$F_{y,s,\cdot,k}^{3,a} = T_{y,c,c,\cdot}^{3,a} + F_{y,d,f,k}^{3,a} + F_{y,b,f,k}^{3,a}$, for k = 1,2

Combined Regions
$F_{y,c,\cdot,k}^{3,a} = T_{y,c,c,\cdot}^{3,a} + F_{y,c,f,k}^{3,a}$, for k = 1,2

There are eight forecasts for the fourth quarter of the current year:

Separate Regions (Domestic and International) – Separate Carrier Groups
$F_{y,s,s,k}^{4,a} = F_{y,d,p,k}^{4,a} + F_{y,b,p,k}^{4,a} + F_{y,d,n,k}^{4,a} + F_{y,d,f,k}^{4,a} + F_{y,b,f,k}^{4,a}$, for k = 1,2

Combined Regions – Separate Carrier Groups
$F_{y,c,s,k}^{4,a} = F_{y,c,p,k}^{4,a} + F_{y,c,n,k}^{4,a} + F_{y,c,f,k}^{4,a}$, for k = 1,2

Separate Regions – Combined Carrier Groups
$F_{y,s,c,k}^{4,a} = F_{y,d,c,k}^{4,a} + F_{y,b,c,k}^{4,a} + F_{y,d,f,k}^{4,a} + F_{y,b,f,k}^{4,a}$, for k = 1,2

Combined Regions - Combined Carrier Groups
$F_{y,c,c,k}^{4,a} = F_{y,c,c,k}^{4,a} + F_{y,c,f,k}^{4,a}$, for k = 1,2

The numerical results of the forecasts over the historical period 2002-2005 are given in Tables A.2.1 (third quarter forecasts for foreign carriers) and A.2.2 (fourth quarter forecasts for all carriers) in Appendix A. Results from Table A.2.1 are presented for completeness but do not enter into the selection process. Three summary measures are given for each forecast in both levels and percentage terms: the average error, the standard deviation, and the absolute error. The selection decision is based on measures of percentage error because of the large change in levels with the addition of new-reporting carriers. Due to the small sample size, the forecast is selected based on a subjective evaluation of these measures rather than on formal statistical tests. As an aid to reading Table A.2.1, the best measure for each group of four forecasts varying by time-period and regional grouping is highlighted. The measures in Table A.2.2 clearly favor a forecast based on growth rates calculated using the latest available quarter rather than year-to-date, as all of the best measures under percentage error fall in this category. Selecting between forecasts based on separate or combined regional groups and separate or combined carrier groups is less clear. However, the evidence slightly favors basing the forecasts on growth rates calculated using separate regional groups and combined carrier groups. Therefore, the selected forecast is:

$F_{y,s,c,1}^{4,a} = F_{y,d,c,1}^{4,a} + F_{y,b,c,1}^{4,a} + F_{y,d,f,1}^{4,a} + F_{y,b,f,1}^{4,a}$.
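The summary measures used to compare candidate forecasts can be sketched as follows. This is an illustration only: the function name is hypothetical, "absolute error" is assumed here to mean the mean absolute percentage error, and the sample standard deviation is assumed, since the text does not specify either choice.

```python
from statistics import mean, stdev


def percentage_error_summary(forecasts, actuals):
    """Summarize historical forecast errors in percentage terms.

    forecasts, actuals: equal-length sequences, one entry per historical year.
    """
    errors = [100.0 * (f - a) / a for f, a in zip(forecasts, actuals)]
    return {
        "average_error": mean(errors),                     # signed, reveals bias
        "std_dev": stdev(errors),                          # sample standard deviation (assumed)
        "mean_abs_error": mean([abs(e) for e in errors]),  # assumed meaning of "absolute error"
    }
```

Working in percentage terms keeps years before and after the addition of new-reporting carriers comparable, which is the reason the text gives for preferring percentage measures.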

The growth rates for each group in this forecast will then be applied to the most recently available fourth quarter (and third quarter for foreign carriers) segment and market data from OAI at the individual carrier and route level. Thus, for the aviation components of the 2006 Provisional Matrix, the missing fourth quarter for 2006 is obtained by applying: $G_{2006,d,c,1}^{\cdot,a}$ to all fourth quarter 2005 domestic routes flown by domestic carriers; $G_{2006,b,c,1}^{\cdot,a}$ to all fourth quarter 2005 international routes flown by domestic carriers; $G_{2006,d,f,1}^{\cdot,a}$ to all fourth quarter 2005 domestic routes flown by foreign carriers; and $G_{2006,b,f,1}^{\cdot,a}$ to all fourth quarter 2005 international routes flown by foreign carriers. Applying these values at the disaggregated level allows the above methods for calculating ports-of-entry, values, and prices to be applied without modification.
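Applying the four group growth rates at the record level might look like the following Python sketch. The record layout, field names, and function name are hypothetical; the route codes 'd' and 'b' and carrier codes 'c' and 'f' follow the index definitions given earlier.

```python
def forecast_route_records(prior_year_records, growth):
    """Scale prior-year route-level tons by the growth rate of each record's group.

    prior_year_records: list of dicts with keys 'route_type' ('d' or 'b'),
                        'carrier_type' ('c' domestic or 'f' foreign) and 'tons'
    growth:             dict mapping (route_type, carrier_type) to a growth rate
    """
    forecast = []
    for record in prior_year_records:
        rate = growth[(record["route_type"], record["carrier_type"])]
        # Copy the record so the prior-year data are left untouched.
        forecast.append({**record, "tons": record["tons"] * rate})
    return forecast
```

Because every record in a group is scaled by the same rate, the route and carrier pattern of the prior-year quarter is preserved in the forecast.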

3.4.2 Forecasting the Commodity Distribution and Price

The commodity distribution and price are forecast based on historical data from Census. Exports and imports are forecast separately for international shipments and the export distribution and price are then used to forecast domestic shipments. As above, due to the disruptions to the aviation industry, only data for 2002 and later are used as a basis for the forecasts.

3.4.2.1 International Shipments

Census data on imports and exports is the only timely source of data on the value and commodity distribution of air shipments. Historical data from Census is used to forecast the price (value divided by weight) and commodity shares for the fourth quarter of the most recent year. The forecasts are then combined with the aggregate weight forecast from the OAI data and the techniques outlined above for estimating the value and commodity distribution are then applied.

Forecasts use available information to generate estimates of unavailable information. Forecast techniques are evaluated by applying each technique to generate historical forecasts, calculating errors against the known values, and summarizing the errors according to the forecast criterion. The basic philosophy is that the historical 'forecasts' should be generated using only information that would have been available to a forecaster under the same production conditions in which future forecasts will be generated. For FAF2 purposes, a forecast for the fourth quarter of 2003 uses only information that would have been available in December 2003 (Census data up to the third quarter of 2003). The evaluation criterion used here is the average squared error for the forecasts of prices and the standard deviation of the forecast errors for the commodity distributions. The reason for the different criteria is that the price forecast errors are not necessarily mean zero, and the standard deviation would fail to incorporate undesired bias effects.
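The difference between the two criteria can be illustrated with a short Python sketch. The function names are hypothetical, and the population standard deviation is used so that the identity between average squared error, bias, and spread holds exactly.

```python
from statistics import mean, pstdev


def mean_squared_error(errors):
    """Average squared forecast error; penalizes both bias and spread."""
    return mean([e * e for e in errors])


def bias_and_spread(errors):
    """Return (bias, spread, bias^2 + spread^2); the last equals the mean squared error."""
    bias = mean(errors)
    spread = pstdev(errors)  # population standard deviation
    return bias, spread, bias ** 2 + spread ** 2
```

A forecast that is too high by exactly 2 in every year has a standard deviation of zero but an average squared error of 4, which is why the standard deviation alone would understate the error of a biased price forecast.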

Three forecasts of prices are considered: (1) year-to-date (YTD) prices through the third quarter are used for the fourth quarter; (2) the annual increase in each individual commodity price, based on YTD in the current year and the same period in the previous year; and (3) the annual increase in the price of aggregated commodities, based on YTD in the current and previous year, applied to all individual commodities. The second and third forecasts of the price increase are then applied to the individual fourth quarter prices from the previous year. The first forecast will not include seasonal effects on prices, while the second and third forecasts will. The final forecasts of commodity prices may use different techniques for different commodities, since seasonal effects may be important for some commodities but not others, and because forecasts for the individual commodities are independent.
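The three candidate price forecasts can be sketched in Python as follows. The function name, dictionary layout, and result keys are hypothetical, and the aggregate price is assumed to be total value divided by total weight across all commodities.

```python
def forecast_q4_prices(ytd_current, ytd_previous, q4_previous_price):
    """Return the three candidate fourth quarter price forecasts per commodity.

    ytd_current, ytd_previous: dicts mapping commodity to (value, weight) for
                               the year-to-date period in each year
    q4_previous_price:         dict mapping commodity to last year's Q4 price
    """
    def price(value, weight):
        return value / weight

    def aggregate_price(ytd):
        # Assumed definition: total value over total weight.
        return price(sum(v for v, _ in ytd.values()), sum(w for _, w in ytd.values()))

    aggregate_growth = aggregate_price(ytd_current) / aggregate_price(ytd_previous)
    forecasts = {}
    for commodity in ytd_current:
        p_cur = price(*ytd_current[commodity])
        p_prev = price(*ytd_previous[commodity])
        forecasts[commodity] = {
            "ytd_level": p_cur,                                              # forecast 1
            "own_growth": (p_cur / p_prev) * q4_previous_price[commodity],   # forecast 2
            "aggregate_growth": aggregate_growth * q4_previous_price[commodity],  # forecast 3
        }
    return forecasts
```

Only the second and third forecasts start from the previous year's fourth quarter price, so only they carry forward any fourth quarter seasonal pattern.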

Appendix A Tables 2.3 (a-b) provide summary results for the three forecast techniques across commodities for imports and exports. Appendix A Tables 2.4 (a-b) provide results across years, by commodity.

3.4.2.2 Domestic Shipments

The export commodity distribution and prices used for the latest provisional year are based on the implied forecast of these values for exports given above.

3.4.3 Conclusion

The aviation portion of the provisional commodity O-D data requires estimating key components that are missing from the available data and forecasting a portion of the data to provide timely information for analysis. The techniques outlined above provide a reasonable approach to filling the data gaps that will provide useful information to users of the database.

6 An adjustment is made to the CFS data to account for observations that have been rounded to zero for either value or tonnage. National level prices are calculated for each commodity based only on observations for which both value and tonnage are greater than zero. The national level price is then used to calculate the missing tonnage (value) by dividing the non-zero value by the price (multiplying the non-zero tonnage by the price).
7 For revised releases, the most recently available data may be used, avoiding the need to use forecasts.
8 Note: The specification is geared towards the usual situation where data is available through September for domestic carriers and June for foreign carriers. If the forecast is implemented when fewer or more months are available, then modifications would be required. For growth rates, the latest quarter would refer to the latest available three months. For example, if data is only available through August for domestic carriers, then the latest quarter would be June through August, and growth rates would be calculated relative to the same period in the previous year. On the other hand, the base to which the growth rates are applied to generate forecasts consists of the unavailable months. So if data is only available through August, the growth rates are multiplied by tons shipped in the September to December period of the previous year.
