I. Introduction and Background History

The overall increase in the popularity of the Internet in recent years has carried along with it the growth of Internet auctions as a means of exchanging goods.  E-bay, Amazon, and Yahoo have emerged as the leaders in this still relatively new consumer distribution channel.  These sites sell everything from toys to clothes to car parts and carry almost all types of consumable goods.  Online auctions have greatly reduced the costs associated with product distribution and, as a result, now handle billions of dollars' worth of transactions every year.

The sports cards industry has been greatly affected by the introduction of the Internet, both positively and negatively.  While there have always been large-scale nationwide businesses that take out multi-page ads in magazines such as Beckett and Tuff Stuff, a large majority of sports card businesses are small mom-and-pop shops.  These shops tended to do most of their business locally, and thus their sales depended on the regional demand for a product.  The extremely low cost and convenience of Internet auctions have given these small stores a chance to compete against the larger regional and national sports cards outlets, because the markets of the smaller shops have grown considerably.  The demand they face is now essentially the same as that of the larger shops, giving them a chance to compete for customers even though they do not have the money required for advertising and promoting their store.  Boxes that might have gone unsold can now be sold to someone somewhere who was not previously accessible.  Before Internet auctions, the supply facing a particular customer looking for a box was fixed based on where he lived; he only had access to a small portion of the total market supply.  Since Internet auctions started to take off, though, a customer has hundreds more opportunities for buying a product and has access to a much larger portion of the total supply (though the total supply is essentially fixed).  As a result, the market on Internet auction sites has become flooded, and sellers tend to compete continuously against each other by lowering prices, even below production cost.  While this is good for the consumer, it obviously hurts the seller and is the major factor affecting the smaller local stores.

For example, say a local store charges $40 for a box of baseball cards.  Before Internet auctions, a customer had few options:  he could buy the box from the store, look at other stores in the area, or pursue national shops.  His options were very limited, and the local card shops, knowing this, had a reason to collude (essentially forming an oligopoly) in setting their prices.  Now, with the introduction of Internet auctions, the same card shop is still charging $40, but the customer can go online and find many boxes that are cheaper than the one at the local shop.  One prevalent theory is that these Internet auctions will drive local shops out of business, forcing them to jump on the auction bandwagon if they want to continue.

The major reason that the local shops can be undercut so effectively is that fraud in obtaining access to product order information from sports cards companies has greatly increased.  The owners of a sports card shop are generally required to show proof of an actual shop in order to receive product order information from companies.  This proof includes gas/heat receipts, rent receipts, and copies of other shop expenses.  The requirement is meant to ensure that cards go only to those who have legitimate shops, and not to those who do not have shops (and thus do not bear the associated costs) and can therefore unload the product for much less than the shops can.  The ease and convenience of Internet auctions have made this problem much worse, as people see an even easier way to exploit the system.

Another factor affecting Internet auctions is time.  Auctions can last anywhere from 3 to 10 days, and when this is added to the time needed for the seller and buyer to contact each other, for payment to be received, and for the product to be shipped, it can often be several weeks before a buyer gets his box of cards.  Relatedly, people may watch auctions for a week to win a box and then lose out at the last minute, requiring them to wait at least another day to win a box.  For someone who needs a box immediately or by a certain date, local card shops still hold an advantage over Internet auctions (Note: the introduction of PayPal (essentially an e-money transfer service where people can send money to one another by e-mail address) has greatly streamlined the payment system in the last 6 months).

Another major drawback of Internet auctions has to do with fraud and the obvious communication gap between buyer and seller that doesn’t exist when a product is sold in person.  Bidders can find themselves sending money for a product they never receive, whereas sellers can end up with winning bidders who never send payment.  This, combined with the time factor mentioned previously, adds a sort of uncertainty cost to each auction.  Buying a box in a local shop might cost more in terms of the money paid for the box, but when the time costs of waiting for an auction to end and for the product to arrive are factored in, the result could be very different.  Thus, there are conflicting views on the effect of Internet auctions on sports cards shops, and I feel it is important to study the Internet auctions of sports cards to see if there are factors that can be controlled to affect the price of a box of cards.

 

 

II. Description and Reasoning of Model

To start, I chose four different boxes to study.  A box is defined as a sealed hobby box of the particular product.  A hobby box is the type of box made available to dealers, as compared to a retail box, which is found in retail stores such as Wal-Mart, K-Mart, and Dick’s.  The number of packs per box and cards per box varies among the four boxes.  I tried to pick boxes that were fairly active and that were coming out during the time of my study so I could examine the prices before and after the release.  Topps Baseball Hobby (TH) (released 11/20/00) and Fleer Tradition Hobby Baseball (FT) (released 2/7/01) are the base products from Topps and Fleer Co.  These products are both priced relatively low ($1.29 and $1.50 per pack suggested retail price), are produced in the greatest quantity by their respective companies, and contain 36 packs per box.  Topps Heritage Hobby Baseball (THH) (released 1/26/01) and Fleer Triple Crown Hobby Baseball (FTC) (released 1/26/01) are newer, mid-range products ($4.00 and $2.50 per pack suggested retail, respectively) with 24 packs per box.  [Note: for the rest of the paper the boxes will be referred to by their abbreviations.]

E-bay was chosen as the auction site to study, instead of Yahoo or Amazon, for several reasons.  The first has to do with the probability of an auction actually yielding a transaction.  Since Yahoo charged no listing fees (E-bay charges $.25 to $2.00 depending on the starting or reserve price), a seller there has less motivation to ensure that an auction results in an actual transaction, since the product can be re-listed (re-advertised) at no additional monetary cost (Note: within the last month Yahoo has changed its structure and now does charge fees on auction listings).  Sellers can put high reserve prices on their auctions and hope that someone will meet them, knowing that an unsuccessful auction would have cost them money on E-bay but costs them nothing on Yahoo.  Thus, E-bay and Amazon sellers have more of a reason to list items that will sell the first time they are listed (and not be repeatedly re-listed).  Consistent with this, according to one study from the summer of 1999, 54% of all E-bay auctions resulted in a sale, whereas Amazon and Yahoo were at 38% and 16% respectively (however, it should be noted that during the time of this study Amazon had a special on auction listing rates; they were all reduced to $.10 from the standard $.25 to $2.00, and this could be the reason for the lower percentage of auctions ending in a sale as compared to E-bay).  E-bay was also chosen because of its much larger volume (both in number of auctions and in dollar value of total sales) and thus its much larger possible sample size.  The same summer of 1999 study estimated E-bay to have an average of 340,000 auctions closing per day and revenues of $190 million per month.  Amazon and Yahoo, on the other hand, had 10,000 and 88,000 auctions per day and generated $2 million and $19 million per month, respectively.

I began collecting data by searching through completed items on E-bay starting on January 3, 2001, for each of the four boxes.  I printed out these listings and manually entered the data for each auction into Excel so that it could be analyzed.  The search results gave the auction number, title, end price, number of bids, and end time for each item.  I used this information to construct the variables for the regression, which was modeled as follows:

 

yi = a + b1X1i + b2X2i + b3X3i + b4X4i + b5X5i + ei

 

The yi represents the end price of the ith auction and is the dependent variable.  The first term, a, gives the intercept of the equation, or the price that would be predicted if all of the Xi’s were 0.  It has little meaning in this model because an item cannot be sold with 0 bids and because of the way the other variables are coded.  X1i represents the number of bids on the ith auction and can take on any integer value greater than zero.  Its coefficient represents the marginal change in the end price of an auction expected from one additional bid on an item.  The number of bids was included to look for any difference between items sold using Buy It Now and those sold normally (Note: I am assuming Buy It Now auctions are those with 1 bid; this will be discussed later).  The remaining coefficients are of the same nature: each represents the marginal increase (or decrease) in the price of an auction resulting from changing the end time, the end day, or the number of similar items (which essentially involves changing one of the previous two).  The next variable, X2, represents the day of the week the item ended on and can take on values from 1 to 7, representing Sunday through Saturday.  This was included to look for any difference not only between particular days of the week, but also between groups of days such as weekdays versus weekends.  X3 is a variable representing the number of similar items, which I defined to be the number of items of the same type listed within 30 minutes before and 30 minutes after an auction.  Theoretically, when there are multiple items listed around the same time, prices should tend to be lower because the supply is greater (if there are 10 people bidding on 1 box, its price should be higher than if 10 people are bidding on 5 boxes, because each individual box is in less demand).
However, it may be the case that when more items are listed, more bidders are looking at them, and thus the opposite may occur.  Multiple items listed together may generate more interest among bidders, thus increasing the price.  Another factor considered in the regression was the ending time of the auction, X4 (which falls at the same time of day as the start time).  The value was entered as the hour of the day in which the auction ended; thus, possible values are from 1 to 24 (i.e. an auction that ended at 0:30 [12:30 a.m.] occurs in the 1st hour of the day and gets a value of 1, while an auction that ended at 16:56 gets a value of 17).  All times were kept in Pacific Standard Time, which is the time that E-bay uses to classify auctions.  It seems that sellers should want to list auctions at times when the greatest number of bidders will look at them (i.e. in the evenings for all parts of the country).  With a larger market of bidders, the price should be higher.  However, auctions ending at times when there are very few potential bidders (early morning or midday) may draw higher prices: a bidder who knows he cannot be present right before the end of the auction may enter a higher price than he otherwise would have during his last possible chance to bid, to ensure a higher probability of winning the item.  The final variable, X5, represents whether or not the item sold was a pre-sell and takes on a value of 0 for a normal sale and 1 for a pre-sell (i.e. an auction that was completed before the day the product was released to the public).  This was included to see what kind of impact selling items before they actually came out had on an auction’s price.  The last term in the equation, ei, represents the error term of the ith auction, that is, the difference between the ith auction’s price and the predicted value.
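As a minimal sketch of how the coding rules described above could be applied to a single auction, the following Python fragment derives EndDay, SimilarItems, EndTime, and the pre-sell indicator from an end timestamp. The data in the study were entered by hand in Excel; the function names and the sample auction here are hypothetical, and the sketch assumes that "listed within 30 minutes" is measured between auction end times.

```python
from datetime import date, datetime, timedelta


def end_hour(end: datetime) -> int:
    """Hour-of-day code from 1 to 24: 0:30 falls in hour 1, 16:56 in hour 17."""
    return end.hour + 1


def end_day(end: datetime) -> int:
    """Day-of-week code: 1 = Sunday through 7 = Saturday."""
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    return (end.weekday() + 1) % 7 + 1


def pre_sell(end: datetime, release: date) -> int:
    """1 if the auction was completed before the release date, else 0."""
    return 1 if end.date() < release else 0


def similar_items(end: datetime, other_ends: list[datetime]) -> int:
    """Count other auctions of the same box ending within 30 minutes either side.

    Assumption: the 30-minute window is measured between end times.
    """
    window = timedelta(minutes=30)
    return sum(1 for other in other_ends if abs(other - end) <= window)


# Hypothetical auction for a box released on 1/26/01; times are taken as Pacific.
auction_end = datetime(2001, 1, 24, 16, 56)
others = [datetime(2001, 1, 24, 16, 40), datetime(2001, 1, 24, 18, 0)]
row = (
    end_day(auction_end),                      # X2
    similar_items(auction_end, others),        # X3
    end_hour(auction_end),                     # X4
    pre_sell(auction_end, date(2001, 1, 26)),  # X5
)
print(row)
```

The number of bids (X1) and the end price (y) would be copied directly from the listing, so they need no coding step.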

 

III. Data Description and Model Estimation

Once a reasonable amount of data was entered for each of the four boxes (193 entries for FTC, 238 for TH, 305 for THH, and 339 for FT), I decided to look at some averages for different groups of SimilarItems, NumberBids, EndTime, and EndDay.  I first compared the mean prices of auctions with no similar items and those with at least one.  The results were split among the four boxes:  two boxes (TH and FT) had higher prices for auctions with no similar items, and the other two had lower prices.  The differences between the two groups ranged from $.32 to $2.03.  Next, looking for something more conclusive, I split the SimilarItems variable differently:  auctions with 0 or 1 similar items versus those with more than 1.  For each of the four boxes, the average price of those with SimilarItems(0,1) was higher than that of those with SimilarItems(>1).  The differences were quite small ($.21 to $.77), but it did appear that multiple items listed at the same time suffered from an “excess supply” effect that lowered their prices.  I carried the comparison one step further by examining the prices for items with SimilarItems(0,1,2) and those with SimilarItems(>2).  The results shifted even more in the direction of a price decrease due to additional items.  For all four boxes the difference between the average prices was noticeably larger ($.61 to $3.75).  It should be noted, though, that the number of boxes with SimilarItems(>2) was generally much smaller than the number with SimilarItems(0,1,2) (Topps Hobby had only 6 boxes with SimilarItems(>2)), and thus the results are not necessarily reliable.
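The three comparisons above all follow the same pattern: split the auctions at a SimilarItems cutoff and compare the mean end price on each side. A rough sketch of that calculation is given below; the prices are invented purely for illustration and are not data from the study, and the function name is hypothetical.

```python
from statistics import mean

# Hypothetical (similar_items, end_price) pairs; NOT data from the study.
auctions = [(0, 41.00), (1, 40.50), (0, 39.75), (2, 39.00), (3, 38.25), (4, 37.50)]


def mean_price_split(rows, threshold):
    """Return (mean price where SimilarItems <= threshold, mean price where > threshold)."""
    low = [price for n, price in rows if n <= threshold]
    high = [price for n, price in rows if n > threshold]
    return mean(low), mean(high)


# Cutoffs 0, 1, and 2 correspond to the (0), (0,1), and (0,1,2) groupings.
for threshold in (0, 1, 2):
    low_mean, high_mean = mean_price_split(auctions, threshold)
    print(f"SimilarItems <= {threshold}: {low_mean:.2f}   "
          f"> {threshold}: {high_mean:.2f}   "
          f"difference: {low_mean - high_mean:.2f}")
```

Note that `statistics.mean` raises an error when one side of the split is empty, which is a reminder of the small-group problem mentioned above for SimilarItems(>2).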

Next, I looked at the number of bids (NumberBids) to gauge the effect of the “Buy-It-Now” feature on auctions.  I hypothesized that auctions ending with one bid were of this variety and those with more than one bid were not.  While this assumption may seem a little haphazard, after looking over a large number of auctions I found that the large majority of those ending with one bid were “Buy-It-Now” auctions.  The average prices for auctions ending with one bid were higher for TH, THH, and FTC and slightly lower for FT.  The differences in average prices were comparable to those for the other variables: -$.12 (FT) to $2.25 (FTC).  It seems that with the “Buy-It-Now” feature people pay a higher price in order to win an auction immediately, which reduces waiting time.  A bidder can guarantee himself the auction by bidding the specified price and can also end the auction immediately, getting the product into his hands faster.  Thus, it makes sense that auctions with this feature would carry a higher price:  a premium is being paid for the reasons mentioned above.

After this, I examined the EndTime variable by splitting it into several time blocks and looking at the averages.  I constructed the time blocks as follows: two “prime” time blocks and two non-prime blocks.  A prime block is a stretch of time with the potential for a much larger audience (i.e. in the evenings after work, but before the time when people traditionally go to sleep).  The prime time was split into a block that is “prime” for everyone and a block that is “prime” only for certain time zones.  I chose 4 – 8 p.m. PST (17 – 20 in the model) as a prime block for all time zones (since its range [4 p.m. - 12 a.m. depending on the time zone] is generally work free for everyone) and 8 – 11 p.m. PST (21-23) as a partial prime block, in that the eastern half of the United States will have considerably fewer bidders but the mountain and western time zones will still have a potentially large audience.  The other two time blocks I examined were 11 p.m. – 9 a.m. PST (24, 1-9) (a completely non-prime block for everyone) and 9 a.m. – 4 p.m. PST (10-16).  The mean price for each of these time blocks was computed for all four products.  The 10-16 block was the clear consensus as the time block giving the highest price.  The lowest mean price was also fairly consistent, with 3 of the 4 boxes showing the lowest mean price in the 21-23 time slot; the other box (FT) had its lowest price in the 17-20 time slot.  The differences in means ranged from $.89 to $2.04.  Also, for 3 of the 4 boxes the 17-20 time slot had the lowest or next lowest average price, which indicates that a larger potential audience might not lead to a higher price.  Instead, it appears people might bid higher prices on auctions that end during the day or late at night (i.e. auctions that end at times when they cannot physically watch and thus cannot decide whether or not to increase their bid at the last minute) in order to ensure that they win them.
The 17-20 block had the greatest number of auctions listed in it for each of the 4 boxes, which suggests some correlation with the number of similar items.  Thus, the lower price for that time slot may have been due to the fact that so many more auctions were listed during it (because of the larger audience), and all these additional boxes decreased competition among bidders and lowered the price.  However, the 10-16 time slot had the second largest number of listings for 2 of the boxes (and the 3rd largest for another), which would seem to indicate the opposite.  Overall, there appeared to be a definite relation between the end time of an auction and its end price.
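The grouping of hour codes into the four time blocks can be sketched as follows. This assumes the 1 to 24 hour coding in Pacific time described in Section II, so the overnight block is written as 24 plus 1-9; the function names and the sample prices are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def time_block(hour_code: int) -> str:
    """Map an hour code (1-24, Pacific time) to one of the four time blocks."""
    if not 1 <= hour_code <= 24:
        raise ValueError("hour code must be between 1 and 24")
    if 10 <= hour_code <= 16:
        return "10-16"   # 9 a.m. - 4 p.m. PST
    if 17 <= hour_code <= 20:
        return "17-20"   # 4 - 8 p.m. PST, prime for all time zones
    if 21 <= hour_code <= 23:
        return "21-23"   # 8 - 11 p.m. PST, prime mainly for western zones
    return "24,1-9"      # 11 p.m. - 9 a.m. PST, non-prime for everyone


def mean_price_by_block(rows):
    """Average end price per time block for (hour_code, end_price) pairs."""
    prices = defaultdict(list)
    for hour_code, price in rows:
        prices[time_block(hour_code)].append(price)
    return {block: mean(values) for block, values in prices.items()}


# Hypothetical auctions; NOT data from the study.
sample = [(12, 40.0), (13, 42.0), (18, 38.0), (24, 39.0), (3, 40.0)]
print(mean_price_by_block(sample))
```

Counting the entries in each block with the same grouping would give the number of listings per block discussed above.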

The day an auction ends on (EndDay) is another important variable that the seller has complete control over.  The results for the four boxes varied across almost every day, and the range between the highest and lowest average prices was much larger than for the other variables.  Thursday gave the highest mean price for two boxes (TH and FTC), Tuesday for THH, and Monday for FT.  All of these were weekdays, which did show somewhat of a trend as compared to the days that gave the lowest average price:  Tuesday, Wednesday, Saturday, and Sunday for TH, THH, FTC, and FT respectively.  The price differences between the highest and lowest averages ranged from $1.28 to a staggering $12.01 for THH.  It seems that during the week there might be a larger potential audience than on the weekend due to weekend activities or vacations.  This could possibly explain why the highest average prices occurred during the week instead of on the weekend.  Again, there could be some correlation with the number of similar items listed, in that Saturday might have many more listings than, say, Monday because sellers think that they will have a larger audience and thus get a higher price through bidder competition.  These additional listings, which happen to fall on Saturday because of sellers’ beliefs, could be responsible for Saturday’s lower average price.  Since the average prices did not display any overall increasing or decreasing trend for weekdays versus weekends (the average prices went up and down between all the different days of the week), it did not appear that there was a definite trend even for larger blocks of days.

Having seen the trends that emerged for the different variables, I ran regressions on each of the boxes to see if they showed the same results.  For each coefficient individually, I chose a null hypothesis (Ho) that the coefficient equals 0 (bi = 0) and an alternative hypothesis (H1) that the coefficient does not equal 0.  The p-values calculated from the regression in Excel gave the significance level of each coefficient.  A p-value lies between 0 and 1 and is the probability, if the null hypothesis were true, of obtaining an estimate at least as far from zero as the one actually calculated.  A p-value of .0654 means that the coefficient is significant at the 6.54% level.  In other words, the null hypothesis can be rejected at any significance level of 6.54% or higher, but not at a stricter level such as 5%.  Thus, in order to conclude that an individual coefficient is statistically significant at the 5% level, the p-value must be less than .05 [Note:  I assumed a 5% level of significance.  For the rest of the paper whenever the significance of a coefficient is mentioned it is assumed to be at the 5% level.].  I also looked at the multiple coefficient of determination (R2) to determine the percentage of the variance in the price explained by the independent variables in the model:  (R2)*100 gives the percentage of the variance in prices accounted for by the regression.  For example, if R2 equaled .37, then 37% of the variability in an auction’s final price would be explained by the variables I chose to examine (EndDay, EndTime, etc.) and the rest would be due to random fluctuations or factors not included in the model.  Thus, a higher R2 value means the variables in the model explain more of the variance in price and the model is a better fit.
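To make the R2 calculation concrete, the fragment below fits a least-squares regression and computes R2 as one minus the ratio of the residual sum of squares to the total sum of squares. It is only an illustrative sketch: the regressions in this paper were run in Excel, the function names are hypothetical, the data are invented, and p-values are not computed here. For brevity it uses two predictors (number of bids and number of similar items) rather than all five.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(xs, ys):
    """Least-squares fit of y = a + b1*x1 + ...; returns (coefficients, R2)."""
    rows = [[1.0] + list(x) for x in xs]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations: (X'X) coef = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical (number_bids, similar_items) rows and end prices; NOT study data.
predictors = [(1, 0), (5, 1), (8, 0), (12, 3), (3, 2), (10, 1)]
prices = [38.0, 39.0, 40.5, 39.5, 37.5, 40.0]
coefficients, r2 = ols(predictors, prices)
print("a, b1, b3 =", [round(c, 3) for c in coefficients])
print("R2 =", round(r2, 3))
```

The sketch does not guard against perfectly collinear predictors, which would produce a zero pivot in `solve`.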

All the coefficients for each of the individual regressions were examined for statistical significance at the 5% level.  In the Fleer Tradition regression, all the coefficients except that of EndDay (b2) were found to be significant not only at the 5% level but also at the 1% level.  The p-value for EndDay was .1105, indicating it would be significant only at about the 11% level.  This is much higher than 5%, and thus I could not reject the null hypothesis (i.e. that b2 equaled zero).  Every coefficient except b5 was fairly small in terms of affecting price;  the range for b1-b4 was -.27 to .26, indicating that changing any of the variables X1-X4 by an increment of 1 (while keeping every other variable fixed) would result in a small increase or decrease in box price.  For example, b3 equaled -.27, which indicates that having one more similar item being auctioned off in the same time period results in an expected price decrease of $.27.  b5, in sharp contrast to the other coefficients, had a fairly large value (-4.81), and since X5 has only two possible values (1 for pre-sell, 0 for regular sale), this indicates that boxes sold before the release date of the product sold for $4.81 less on average than those sold after it.  The R2 value was .307 and the adjusted R2 (the R2 adjusted for the number of predictors in the model) was .297, indicating that only about 30% of the variance in an auction’s price was due to the factors I examined.  Thus, the overall effectiveness of the model in this regression was fairly poor.  The results from the Fleer Triple Crown regression were almost completely inconclusive.  Every coefficient except b5 had a large p-value (.32 to a staggering .99), and thus the null hypothesis could not be rejected for any of them individually.  The range of these four coefficients was from -.05 to .27, indicating that not only were they statistically insignificant, but they also had little practical effect on the price of a box.
Again, b5 had a fairly large value (3.78) and was statistically significant at the .02% level.  The price of an average box should be almost $4.00 higher if sold during the pre-sell period than if sold after it, in direct contrast to a Fleer Tradition box.  The R2 and adjusted R2 values were .082 and .058, respectively.  These values were both extremely low, and according to the second one about 94% of the variation in a box’s price was not explained by the model.  The Topps Heritage Hobby regression was not much more conclusive than the Fleer Triple Crown one.  b1 and b5 were the only 2 coefficients found to be statistically significant in the model.  The other three coefficients had p-values from .26 to .66.  As in the two previous regressions, all of the coefficients except b5 were fairly small (-.13 to .31).  b5 was a staggering -19.37, indicating a difference of almost $20 between a box sold before and after the release date (all other factors held constant).  This regression was the only one with an adjusted R2 value (.74) that gave some validity to the variables I chose to explain an auction’s price.  The Topps Hobby regression coefficients were small in terms of the magnitude of an auction’s price change, and two of them (b2 and b3) were not significant at the 5% level (Note: since all auctions took place after the release date, there were no pre-sells and thus the pre-sell variable was eliminated from the model).  The range of the four coefficients was -.14 to .16, again indicating a small effect on price.  The adjusted R2 value was an extremely low .065, again indicating that the variables in the model explained very little of the variation in price.  [Note:  all the coefficients, statistics, and average prices can be found after the paper in the different tables.]
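The relationship between R2 and adjusted R2 can be checked with the standard formula, adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The sketch below assumes this is the adjustment applied in the regression output; the function name is hypothetical. Using the Fleer Triple Crown figures as reported (R2 of .082, 193 auctions, 5 predictors) it gives a value close to the reported .058, with the small gap attributable to rounding of R2.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R2 for n observations and k predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Fleer Triple Crown figures as reported: R2 = .082, 193 auctions, 5 predictors.
print(round(adjusted_r2(0.082, 193, 5), 3))
```

Because the penalty term (n - 1)/(n - k - 1) is always greater than 1 when there is at least one predictor, the adjusted value can never exceed R2 itself.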

Overall, the regressions generally seemed to indicate that the factors I had selected as possible influences on a box’s price had very little influence (especially when considering the low adjusted R2 values).  This was in direct contrast to most of the findings from the averages of the different values of the variables.  The NumberBids coefficient was significant in three of the regressions, and for these three it ranged from $.16 to $.31.  Thus, attracting a significant number of extra bids on an auction could raise its price slightly, although the problem is that the number of bids is one factor over which the seller has less control.  Listing an auction with an initial price very close to the end prices of other auctions would seem to result in fewer bids than starting an item at a very low price.  However, an item could quickly get bid up to a price near the going price of other auctions, resulting in a small number of bids even for a low starting price.  Also, by setting the starting price very low the seller faces the risk of the product not reaching the price that other auctions are going for.  The averages for auctions with one bid versus those with more than one bid appeared to show that products auctioned off with the “Buy-It-Now” feature (one bid) command a premium.  The assumption made (that auctions with one bid ended with “Buy-It-Now”) is a fairly accurate one, which can be verified by looking at any sample of baseball card auctions on E-bay.  Also, it seems highly unlikely that auctions with more than 1 bid would end with “Buy-It-Now”, because the main advantage of winning one of these auctions is that it ends immediately and the buyer does not have to wait.
An initial bidder whose bid is below the “Buy-It-Now” price would have just as much reason to bid on another auction, especially since he knows that someone will most likely come along and bid the “Buy-It-Now” price; his chances of winning the auction are fairly slim, so it seems likely that few “Buy-It-Now” auctions end with more than one bid.  In general, it seems that the number of bids on an item is a factor that can be somewhat (although not totally) controlled to affect the price of an auction.

The EndDay coefficient was not statistically significant in any of the four regressions.  This does not mean that an auction’s end day does not affect its price, but that its effect cannot be determined with enough certainty in the model.  The averages for each day computed before the regressions seemed to indicate this might happen.  The design of the EndDay variable (1 for Sunday through 7 for Saturday, in chronological order) meant that the coefficient would give the change in box price expected from moving from the start of the week toward the end of the week, one day at a time.  However, the average prices for each day did not move in any general upward or downward pattern for any of the four boxes.  Although each of the boxes had its highest average price fall on a weekday (Monday - Thursday), two of them also had their lowest average price on a weekday.  While the differences between the highest and lowest prices were definitely large in dollar terms, there did not appear to be any conclusive way of determining why a certain day gave a higher price than another.  There are obvious differences between groups of days such as weekdays versus weekends, but there do not appear to be any major differences between, say, Tuesday and Thursday or Monday and Wednesday.  Yet for Topps Hobby and Fleer Tradition these days were the highest and lowest average prices, respectively.  Thus, I concluded that although the end day of an auction can definitely affect its price, I could not determine specifically how it does (i.e. any pattern for predicting which days would give a higher price).  Perhaps some sort of model could be constructed that compares the prices of every end day for different products but also factors in the size of the difference between the corresponding levels (so that a difference that is larger in terms of price gets more weight).
In other words, a $10 difference between the highest and second-highest average prices for one box should get more importance than a $2 difference between the highest and second-highest average prices for another box.  Currently, it seems that the average prices for a day are due more to randomness than anything else, and the way the average prices moved among the days for FTC, FT, and THH exhibited this.

The number of similar items being auctioned off in a similar time period is something that would seem to definitely affect the price in one way or another.  However, in three of the four regressions it was statistically insignificant.  This indicates either that it might still affect an auction’s price but the effect cannot be determined with enough certainty, or that there might be some sort of non-linear relationship between the number of similar items and the end price.  For example, when SimilarItems is small, an additional similar item might increase the price, whereas when it is large, the addition of one more box to the auctions might decrease the price, or vice versa.  When the averages for auctions with SimilarItems(0,1) were compared to those with SimilarItems(>1), the (0,1) case (those auctions with fewer similar items) gave a higher price for each of the four boxes.  Comparing the auctions with SimilarItems(0,1,2) to SimilarItems(>2) showed the same general trend with a larger effect.  Thus, it seems that having more similar items in the time period in which a box is being auctioned off causes its price to decrease.  This is most likely because more boxes are available to bid on, and thus bidders do not need to compete against each other as much to win a box.

The EndTime variable is one of the few variables in the model that the seller has complete control over.  Thus, it might seem likely that there should be some sort of definite trend for it.  Its coefficient was not significant in half of the regressions.  In the two in which it was significant (FT and TH), it was negative both times, which backed up the results found from the averages of the time blocks.  For each of these, the averages of the two latest time blocks (17-20 and 21-23) were less than those of the first two (24, 1-9 and 10-16).  Thus, there appeared to be a general inverse relationship between the end time and the auction’s end price.  However, the averages for the time blocks gave the 10-16 block as the one with the highest price (something not apparent from the regression model).  The times were recorded by hour in order to see the significance of one hour versus another, but the regression results and the averages calculated for the time blocks showed that this comparison was not significant; instead, it was groups of hours that mattered in determining a box’s end price.  This was most likely because there was not a linear relationship between the end price and the end time: for each of the boxes, the average price went up and down in going from one time block to the next in chronological order (from the start of a day to the end of it).  It does appear that products set to end earlier in the day ended with higher prices than those set to end later in the day.  A complicating factor, though, is the “Buy-It-Now” feature, which causes an auction to end immediately.  An auction put up at 9 P.M. might end at 2 P.M. the next day, when several other auctions do.  The others would then be recorded as having an additional similar item, even though this item was not scheduled to end at that time and so was not in direct competition with them.
Thus, an item with more similar items than it should really have would get classified as such and its higher price would go in the SimilarItems(3) category instead of SimilarItems(2).  Thus, the average of SimilarItems(2) is slightly lower than it should be and SimilarItems(3) slightly higher.  This could possibly cause the effect of the number of similar items on an auctions price to be even larger than found.

While the seller has complete control over the end time of an auction (unless it ends with the “Buy-It-Now” feature), there appears to be some correlation between the end time and the number of similar items listed.  A seller has some control over the number of similar items (he can look at auctions posted within the previous thirty minutes), but probably has very little idea of how many are going to be posted in the next thirty minutes.  More auctions are listed at certain times of the day because of people’s preconceived ideas about how the end time of an auction affects its price.  This raises a question: was the highest average price, for auctions that ended between 12 and 7 P.M. EST, due to some unknown characteristic of that time of day, or was it instead due to fewer auctions ending during those hours (so that each auction is more likely to have a low number of similar items, which could lead to a higher price)?  To examine this further, I looked at the average number of similar items for each of the four time blocks.  If the number of similar items is the major factor in determining price, then the times with higher prices should have a lower average number of similar items.  The TH averages were all very close to each other and all less than 1, so very little could be concluded.  The other three boxes had a difference of at least 1.5 items between the highest and lowest average.  For FTC, THH, and FT, the 17-20 time period had the highest average number of similar items, which fits the theory that the lower prices received during that time could be due to there simply being more items auctioned off against each other.  However, the 10-16 time period had the second highest average number of similar items for all three boxes, higher than the 24,0-9 and 21-23 time periods, even though both of those periods had lower average prices.  The number of similar items listed throughout a day appeared to increase until late at night, when it decreased, most likely because most sellers believed that the audience had shrunk significantly (due to people going to sleep, etc.), resulting in fewer boxes being listed and thus fewer similar items per box.  Overall, there appeared to be a slight correlation between the two variables, but it seems that the differences in price among the time periods were more likely due to some unknown feature of those periods or to pure randomness.
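The time-block comparison above can be sketched as follows.  This is a minimal illustration under stated assumptions and not the actual analysis: the names BLOCKS, time_block, and block_averages are hypothetical, the sample rows are made up, and treating hour 0 and hour 24 as the same midnight hour is an assumption, since the text writes the first block both as (24,1-9) and as 24,0-9.

```python
from collections import defaultdict
from statistics import mean

# Time blocks labelled as in the text. Folding hour 24 onto hour 0 is an
# assumption made for this sketch.
BLOCKS = [
    ("24,0-9", range(0, 10)),
    ("10-16", range(10, 17)),
    ("17-20", range(17, 21)),
    ("21-23", range(21, 24)),
]

def time_block(end_hour):
    """Map a whole end hour (0-24) to the label of its time block."""
    if not 0 <= end_hour <= 24:
        raise ValueError(f"hour out of range: {end_hour}")
    hour = end_hour % 24  # 24 becomes 0
    for label, hours in BLOCKS:
        if hour in hours:
            return label
    raise ValueError(f"not a whole hour: {end_hour}")

def block_averages(auctions):
    """Average end price and SimilarItems within each time block.

    auctions: list of (end_hour, similar_items, end_price) triples.
    Returns {block label: (mean price, mean similar items)}.
    """
    grouped = defaultdict(list)
    for end_hour, similar_items, price in auctions:
        grouped[time_block(end_hour)].append((similar_items, price))
    return {
        label: (mean([p for _, p in rows]), mean([s for s, _ in rows]))
        for label, rows in grouped.items()
    }

# Made-up (end_hour, similar_items, end_price) rows, for illustration only.
sample = [(24, 0, 60.0), (3, 1, 58.0), (12, 2, 64.0), (14, 2, 62.0),
          (18, 4, 55.0), (19, 3, 57.0), (22, 1, 56.0)]

averages = block_averages(sample)
for label, (avg_price, avg_similar) in averages.items():
    print(label, avg_price, avg_similar)
```

Putting the two averages side by side for each block makes it easy to see whether the blocks with higher prices are also the ones with fewer similar items.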

Two of the three boxes for which I had pre-sell auction prices showed an increase in price after the release date (i.e. a negative b5 coefficient, since 1 stood for a pre-sell and 0 for a regular sale).  FTC, on the other hand, had a lower average price for boxes sold after the release date.  There does not seem to be any definitive conclusion about how selling a box before it is released affects its price.  A product's price will generally increase after release if some player or particular subset in the product becomes extremely popular, driving up demand and thus the price.  This appears to be what happened to the THH boxes, as those sold during the pre-sell period went for on average $20 less than those sold after release.
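To make the sign convention concrete: in a regression of price on a single 0/1 indicator, the least-squares slope equals the mean price of the group coded 1 minus the mean price of the group coded 0.  The sketch below checks this on made-up prices.  It uses only the pre-sell indicator, whereas the regressions in this paper include the other variables as well, so it illustrates how to read the sign and does not reproduce the b5 estimates.  The function name dummy_slope is hypothetical.

```python
from statistics import mean

def dummy_slope(flags, prices):
    """Least-squares slope of price on a 0/1 indicator (with intercept).

    Both groups (0 and 1) must be present, otherwise the slope is undefined.
    """
    x_bar, y_bar = mean(flags), mean(prices)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(flags, prices))
    sxx = sum((x - x_bar) ** 2 for x in flags)
    return sxy / sxx

# Made-up data, for illustration only: 1 = pre-sell, 0 = sold after release.
presell = [1, 1, 1, 0, 0, 0, 0]
price = [50.0, 55.0, 60.0, 66.0, 70.0, 72.0, 68.0]

slope = dummy_slope(presell, price)
gap = (mean([p for f, p in zip(presell, price) if f == 1])
       - mean([p for f, p in zip(presell, price) if f == 0]))

# The two numbers agree up to floating-point rounding; a negative value
# means the pre-sell boxes sold for less than the post-release boxes.
print(slope, gap)
```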

 

IV. Conclusion

The auctioning of baseball card boxes on E-bay can be a very profitable profession.  Can the price of any particular box be modeled from the end time of the auction, the number of bids, the number of similar items, the end day of the auction, and whether the box is a pre-sell or not?  In terms of a linear model, the answer appears to be a resounding no.  Thus, either factors other than the ones I examined affect a box's price, or the assumption of a linear model is not correct.  The latter appears to be the better explanation, since each of the variables appeared to have some effect on a box's price when it was looked at in larger blocks rather than in the small incremental changes studied by the regression.  The linear regression model looks at incremental changes between smaller segments of data, whereas the averages I computed looked at broader segments.  Although the examination of these broader segments showed differences for all the variables, most of these differences could not be generalized or explained.  The trends appear to be due to some other factors or to pure randomness in the fluctuations of the auctions.  For further research, I would use a larger sample and also try to create some sort of model to generalize the end time and end day variables, as I discussed earlier.  This might be some sort of ranking structure that includes a weighting system accounting for how large or small the differences between days or times are, to see which days give higher prices.  With a much larger sample, after the best and worst days and times are determined, the average prices of products listed during these days and times could be compared to look for a significant difference between the best possible price and the worst possible price (according to these two variables).  The small sample for each box gave only 3 to 20 entries for a specified day and time, which was not enough to support anything conclusive.  The study of these baseball card auctions has been a very interesting one, but unfortunately there were only a few significant findings that a seller could use to his advantage to obtain a higher price on his auction.
