Dummy variable case studies:
Example 1: A model for seasonal data
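Quarterly data on coal sales in China's market (1982-1988; China Statistical Yearbook, 1987 and 1989) are shown in the figure and table below. Because of heating demand, sales in the fourth quarter of each year are much higher than in the other quarters. Since the data are quarterly, three seasonal dummy variables can be defined as follows, with the first quarter as the base category: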
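$$
D_1 = \begin{cases} 1 & \text{(quarter 4)} \\ 0 & \text{(quarters 1, 2, 3)} \end{cases}
\qquad
D_2 = \begin{cases} 1 & \text{(quarter 3)} \\ 0 & \text{(quarters 1, 2, 4)} \end{cases}
\qquad
D_3 = \begin{cases} 1 & \text{(quarter 2)} \\ 0 & \text{(quarters 1, 3, 4)} \end{cases}
$$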
[Figures not reproduced: quarterly market coal sales, 1982-1988]
National market coal sales by quarter

| Quarter | Yt | t | D1 | D2 | D3 |
|---|---|---|---|---|---|
| 1982.1 | 2599.8 | 1 | 0 | 0 | 0 |
| 1982.2 | 2647.2 | 2 | 0 | 0 | 1 |
| 1982.3 | 2912.7 | 3 | 0 | 1 | 0 |
| 1982.4 | 4087.0 | 4 | 1 | 0 | 0 |
| 1983.1 | 2806.5 | 5 | 0 | 0 | 0 |
| 1983.2 | 2672.1 | 6 | 0 | 0 | 1 |
| 1983.3 | 2943.6 | 7 | 0 | 1 | 0 |
| 1983.4 | 4193.4 | 8 | 1 | 0 | 0 |
| 1984.1 | 3001.9 | 9 | 0 | 0 | 0 |
| 1984.2 | 2969.5 | 10 | 0 | 0 | 1 |
| 1984.3 | 3287.5 | 11 | 0 | 1 | 0 |
| 1984.4 | 4270.6 | 12 | 1 | 0 | 0 |
| 1985.1 | 3044.1 | 13 | 0 | 0 | 0 |
| 1985.2 | 3078.8 | 14 | 0 | 0 | 1 |
| 1985.3 | 3159.1 | 15 | 0 | 1 | 0 |
| 1985.4 | 4483.2 | 16 | 1 | 0 | 0 |
| 1986.1 | 2881.8 | 17 | 0 | 0 | 0 |
| 1986.2 | 3308.7 | 18 | 0 | 0 | 1 |
| 1986.3 | 3437.5 | 19 | 0 | 1 | 0 |
| 1986.4 | 4946.8 | 20 | 1 | 0 | 0 |
| 1987.1 | 3209.0 | 21 | 0 | 0 | 0 |
| 1987.2 | 3608.1 | 22 | 0 | 0 | 1 |
| 1987.3 | 3815.6 | 23 | 0 | 1 | 0 |
| 1987.4 | 5332.3 | 24 | 1 | 0 | 0 |
| 1988.1 | 3929.8 | 25 | 0 | 0 | 0 |
| 1988.2 | 4126.2 | 26 | 0 | 0 | 1 |
| 1988.3 | 4015.1 | 27 | 0 | 1 | 0 |
| 1988.4 | 4904.2 | 28 | 1 | 0 | 0 |

Note: taking the seasonal dummy D1 as an example, the EViews command is D1 = @seas(4).
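For readers working outside EViews, the three seasonal dummies in the table can be built from the observation index alone. The following is a minimal Python sketch, assuming NumPy is available; the variable names `quarter`, `D1`, `D2`, and `D3` are chosen here for illustration and are not part of the original notes.

```python
import numpy as np

# Observation index t = 1..28 (1982 Q1 is t = 1), as in the table above.
t = np.arange(1, 29)

# Quarter of each observation: 1, 2, 3, 4, 1, 2, ...
quarter = (t - 1) % 4 + 1

# Seasonal dummies with quarter 1 as the base category.
D1 = (quarter == 4).astype(int)  # fourth quarter
D2 = (quarter == 3).astype(int)  # third quarter
D3 = (quarter == 2).astype(int)  # second quarter

# Show the first two years so the pattern can be compared with the table.
print(np.column_stack([t, quarter, D1, D2, D3])[:8])
```

Only three dummies are created for four quarters; adding a fourth alongside the intercept would make the regressors perfectly collinear.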
1. Taking time t as the explanatory variable (t = 1 for the first quarter of 1982), the estimated model for coal sales (Y) is:
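$$
\hat{Y}_t = \underset{(26.04)}{2431.20} + \underset{(10.81)}{49.00}\,t + \underset{(13.43)}{1388.09}\,D_1 + \underset{(1.96)}{201.84}\,D_2 + \underset{(0.83)}{85.00}\,D_3 \tag{1}
$$

$R^2 = 0.95$, DW = 1.2, s.e. = 191.7, F = 100.4, T = 28, $t_{0.05}(28-5) = 2.07$. Figures in parentheses are t-statistics.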
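The t-statistics of D2 and D3 (1.96 and 0.83) are both below the critical value 2.07, so their coefficients are not significant. This means the second and third quarters can be merged into the base category, the first quarter. We therefore keep only one dummy variable, D1, which splits the seasonal effect into two classes: the fourth quarter versus the first, second, and third quarters.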
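Model (1) can be re-estimated from the tabulated data by ordinary least squares. The sketch below assumes NumPy and computes the t-statistics from the usual OLS covariance formula; the names `X`, `beta`, and `t_stats` are illustrative. The estimates should come out close to the reported coefficients, with small differences possible from rounding in the published figures.

```python
import numpy as np

# Quarterly coal sales Y_t, 1982 Q1 to 1988 Q4, copied from the table above.
Y = np.array([
    2599.8, 2647.2, 2912.7, 4087.0,
    2806.5, 2672.1, 2943.6, 4193.4,
    3001.9, 2969.5, 3287.5, 4270.6,
    3044.1, 3078.8, 3159.1, 4483.2,
    2881.8, 3308.7, 3437.5, 4946.8,
    3209.0, 3608.1, 3815.6, 5332.3,
    3929.8, 4126.2, 4015.1, 4904.2,
])
t = np.arange(1, 29)
quarter = (t - 1) % 4 + 1

# Regressors: constant, trend, D1 (Q4), D2 (Q3), D3 (Q2).
X = np.column_stack(
    [np.ones(28), t, quarter == 4, quarter == 3, quarter == 2]
).astype(float)

# OLS coefficients.
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Standard errors and t-statistics from s^2 (X'X)^{-1}.
resid = Y - X @ beta
dof = len(Y) - X.shape[1]  # 28 - 5 = 23
sigma2 = resid @ resid / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t_stats = beta / se

for name, b, ts in zip(["const", "t", "D1", "D2", "D3"], beta, t_stats):
    print(f"{name:>5}: coef = {b:9.2f}, t = {ts:6.2f}")
```

Comparing each |t| with the critical value 2.07 reproduces the conclusion in the text: only the trend and D1 are significant among the slope terms.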
2. Dropping the dummy variables D2 and D3 from the equation above gives the following model for coal sales (Y):
$$
\hat{Y}_t = \underset{(32.03)}{2515.86} + \underset{(10.63)}{49.73}\,t + \underset{(14.79)}{1290.91}\,D_1 \tag{2}
$$

$R^2 = 0.94$, DW = 1.4, s.e. = 198.7, F = 184.9, T = 28, $t_{0.05}(25) = 2.06$.
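To test further whether the slope also changes, add the interaction variable t D1 to the equation above:

$$
\hat{Y}_t = \underset{(28.24)}{2509.07} + \underset{(9.13)}{50.22}\,t + \underset{(6.85)}{1321.19}\,D_1 - \underset{(-0.17)}{1.95}\,t D_1 \tag{3}
$$

$R^2 = 0.94$, DW = 1.4, s.e. = 202.8, F = 118.5, T = 28, $t_{0.05}(24) = 2.06$.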
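The t-statistic of the coefficient -1.95 is -0.17, far below the critical value 2.06 in absolute value, so the slope has not changed. Model (2) is therefore adopted as the final model.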
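The slope-change check in model (3) can be reproduced in the same way. This is a sketch under the same assumptions as before (NumPy, data copied from the table); `t_inter` and `slope_changed` are illustrative names, and 2.06 is the critical value quoted in the text.

```python
import numpy as np

# Quarterly coal sales Y_t, 1982 Q1 to 1988 Q4, copied from the table above.
Y = np.array([
    2599.8, 2647.2, 2912.7, 4087.0,
    2806.5, 2672.1, 2943.6, 4193.4,
    3001.9, 2969.5, 3287.5, 4270.6,
    3044.1, 3078.8, 3159.1, 4483.2,
    2881.8, 3308.7, 3437.5, 4946.8,
    3209.0, 3608.1, 3815.6, 5332.3,
    3929.8, 4126.2, 4015.1, 4904.2,
])
t = np.arange(1, 29)
D1 = (t % 4 == 0).astype(float)  # 1 in the fourth quarter, else 0

# Regressors: constant, trend, D1, and the interaction t * D1.
X = np.column_stack([np.ones(28), t, D1, t * D1])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# t-statistic of the interaction coefficient.
resid = Y - X @ beta
dof = len(Y) - X.shape[1]  # 28 - 4 = 24
se = np.sqrt(resid @ resid / dof * np.diag(np.linalg.inv(X.T @ X)))
t_inter = beta[3] / se[3]

# Two-sided test at the 5% level with the critical value from the text.
slope_changed = abs(t_inter) > 2.06
print(f"interaction coef = {beta[3]:.2f}, t = {t_inter:.2f}, "
      f"slope changed: {slope_changed}")
```

An insignificant interaction term means the fourth quarter shifts the line up without tilting it, which is why the simpler model (2) is kept.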
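Specifically, model (2) can be written as:

$$
\hat{Y}_t = 2515.86 + 49.73\,t \qquad (D_1 = 0,\ \text{quarters 1, 2, 3})
$$

$$
\hat{Y}_t = 3806.77 + 49.73\,t \qquad (D_1 = 1,\ \text{quarter 4})
$$

where the fourth-quarter intercept is 2515.86 + 1290.91 = 3806.77. If no dummy variable is used, the regression result is: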
$$
\hat{Y}_t = \underset{(11.6)}{2731.03} + \underset{(4.0)}{57.15}\,t \tag{4}
$$

$R^2 = 0.38$, DW = 2.5, s.e. = 608.8, T = 28, $t_{0.05}(26) = 2.06$.
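Compared with equation (2), regression (4) fits far worse: $R^2$ falls from 0.94 to 0.38 and the standard error of the regression rises from 198.7 to 608.8.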
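The gap between the two specifications can be checked directly by computing the coefficient of determination for each. The helper `r_squared` below is a hypothetical function written for this sketch, assuming NumPy; it is not taken from the original notes.

```python
import numpy as np

# Quarterly coal sales Y_t, 1982 Q1 to 1988 Q4, copied from the table above.
Y = np.array([
    2599.8, 2647.2, 2912.7, 4087.0,
    2806.5, 2672.1, 2943.6, 4193.4,
    3001.9, 2969.5, 3287.5, 4270.6,
    3044.1, 3078.8, 3159.1, 4483.2,
    2881.8, 3308.7, 3437.5, 4946.8,
    3209.0, 3608.1, 3815.6, 5332.3,
    3929.8, 4126.2, 4015.1, 4904.2,
])
t = np.arange(1, 29)
D1 = (t % 4 == 0).astype(float)  # 1 in the fourth quarter, else 0


def r_squared(X, y):
    """Return R^2 of an OLS fit of y on X (X must include a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


# Trend only, as in equation (4).
r2_plain = r_squared(np.column_stack([np.ones(28), t]), Y)
# Trend plus the fourth-quarter dummy, as in equation (2).
r2_dummy = r_squared(np.column_stack([np.ones(28), t, D1]), Y)

print(f"R^2 without dummy: {r2_plain:.2f}")
print(f"R^2 with D1:       {r2_dummy:.2f}")
```

A single dummy accounts for most of the variation that the trend-only model leaves unexplained, since that variation is almost entirely the fourth-quarter jump.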
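Example 2: Using a dummy variable to distinguish historical periods
Data on China's total import and export trade (1950-1984) are given in the table below. Test whether the slope of this time series changed between the pre-reform and post-reform periods. Define the dummy variable D as follows: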
$$
D = \begin{cases} 0 & (1950\text{-}1977) \\ 1 & (1978\text{-}1984) \end{cases}
$$
China's total import and export trade, 1950-1984 (unit: 10 billion yuan RMB)

| Year | trade | time | D | time D |
|---|---|---|---|---|
| 1950 | 0.415 | 1 | 0 | 0 |
| 1951 | 0.595 | 2 | 0 | 0 |
| 1952 | 0.646 | 3 | 0 | 0 |
| 1953 | 0.809 | 4 | 0 | 0 |
| 1954 | 0.847 | 5 | 0 | 0 |
| 1955 | 1.098 | 6 | 0 | 0 |
| 1956 | 1.087 | 7 | 0 | 0 |
| 1957 | 1.045 | 8 | 0 | 0 |
| 1958 | 1.287 | 9 | 0 | 0 |
| 1959 | 1.493 | 10 | 0 | 0 |
| 1960 | 1.284 | 11 | 0 | 0 |
| 1961 | 0.908 | 12 | 0 | 0 |
| 1962 | 0.809 | 13 | 0 | 0 |
| 1963 | 0.857 | 14 | 0 | 0 |
| 1964 | 0.975 | 15 | 0 | 0 |
| 1965 | 1.184 | 16 | 0 | 0 |
| 1966 | 1.271 | 17 | 0 | 0 |
| 1967 | 1.122 | 18 | 0 | 0 |
| 1968 | 1.085 | 19 | 0 | 0 |
| 1969 | 1.069 | 20 | 0 | 0 |
| 1970 | 1.129 | 21 | 0 | 0 |
| 1971 | 1.209 | 22 | 0 | 0 |
| 1972 | 1.469 | 23 | 0 | 0 |
| 1973 | 2.205 | 24 | 0 | 0 |
| 1974 | 2.923 | 25 | 0 | 0 |
| 1975 | 2.904 | 26 | 0 | 0 |
| 1976 | 2.641 | 27 | 0 | 0 |
| 1977 | 2.725 | 28 | 0 | 0 |
| 1978 | 3.550 | 29 | 1 | 29 |
| 1979 | 4.546 | 30 | 1 | 30 |
| 1980 | 5.638 | 31 | 1 | 31 |
| 1981 | 7.353 | 32 | 1 | 32 |
| 1982 | 7.713 | 33 | 1 | 33 |
| 1983 | 8.601 | 34 | 1 | 34 |
| 1984 | 12.010 | 35 | 1 | 35 |
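The columns time, D, and time D in the table follow mechanically from the year. A minimal sketch, assuming NumPy; the array names mirror the column headers, with `time_D` standing in for the product column.

```python
import numpy as np

# Sample years 1950..1984.
year = np.arange(1950, 1985)

# Time index: 1950 corresponds to time = 1.
time = year - 1949

# Period dummy: 0 before the reform (1950-1977), 1 from 1978 onward.
D = (year >= 1978).astype(int)

# Interaction term that lets the slope differ between the two periods.
time_D = time * D

# Show the rows around the break point for comparison with the table.
print(np.column_stack([year, time, D, time_D])[26:31])
```

Including D alone would allow only the intercept to shift; the product time D is what makes a change in slope testable.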
With time as the explanatory variable and total import and export trade denoted by trade, the estimation result is:
$$
\widehat{trade} = \underset{(1.86)}{0.37} + \underset{(5.53)}{0.066}\,time - \underset{(-10.98)}{33.96}\,D + \underset{(12.42)}{1.20}\,time\,D
$$
$$
\widehat{trade} = \begin{cases} 0.37 + 0.066\,time & (D = 0,\ 1950\text{-}1977) \\ -33.59 + 1.27\,time & (D = 1,\ 1978\text{-}1984) \end{cases}
$$
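The estimates show that both the intercept and the slope changed after the reform. The average annual increase in total trade rose from 0.066 to about 1.27 (in units of 10 billion yuan), an increase of roughly 18 times the pre-reform figure (1.20 / 0.066 ≈ 18).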
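The regression and the two period-specific lines can be recovered from the tabulated data as follows. This is a sketch assuming NumPy; the names `pre` and `post` are illustrative. The fitted coefficients should be close to those reported above, up to rounding.

```python
import numpy as np

# Total import and export trade, 1950-1984 (10 billion yuan), from the table.
trade = np.array([
    0.415, 0.595, 0.646, 0.809, 0.847, 1.098, 1.087, 1.045, 1.287, 1.493,
    1.284, 0.908, 0.809, 0.857, 0.975, 1.184, 1.271, 1.122, 1.085, 1.069,
    1.129, 1.209, 1.469, 2.205, 2.923, 2.904, 2.641, 2.725,
    3.550, 4.546, 5.638, 7.353, 7.713, 8.601, 12.010,
])
year = np.arange(1950, 1985)
time = (year - 1949).astype(float)
D = (year >= 1978).astype(float)

# Regressors: constant, time, D, and the interaction time * D.
X = np.column_stack([np.ones(35), time, D, time * D])
beta, *_ = np.linalg.lstsq(X, trade, rcond=None)
b0, b1, b2, b3 = beta

# Implied intercept and slope in each period.
pre = (b0, b1)             # D = 0, 1950-1977
post = (b0 + b2, b1 + b3)  # D = 1, 1978-1984

print(f"1950-1977: trade = {pre[0]:.2f} + {pre[1]:.3f} time")
print(f"1978-1984: trade = {post[0]:.2f} + {post[1]:.2f} time")
print(f"slope increase relative to pre-reform slope: {b3 / b1:.1f}")
```

Because the model contains a full set of intercept and slope shifts, the two lines are the same as those from separate regressions on each sub-period; the pooled form is used so that the shifts themselves can be tested.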