Step 7.1.: Process Warranted and create WarrantCount in df_warrants
1. define extract_consumer_count
   - extracts the `consumerCount` value from the `treatment` column
2. apply extract_consumer_count to create a new `consumerCount` column
3. define generate_warrant_count
   - generates a `WarrantCount` list based on the `Warranted` column
   - if Warranted is None, return None
   - if Warranted is a list, generate a list of counts based on the boolean values
4. apply generate_warrant_count to create a new `WarrantCount` column:
   df_warrants["WarrantCount"] = df_warrants.apply(lambda row: generate_warrant_count(row["Warranted"], row["consumerCount"]), axis=1)
Step 7.2.0.: Combine df_warrants and df_reputation into df_ads
# Strip spaces from column names and standardize the format
df_reputation.columns = df_reputation.columns.str.strip()
df_warrants.columns = df_warrants.columns.str.strip()

# Identify the common columns between the warrants and reputation DataFrames
common_cols = list(set(df_reputation.columns) & set(df_warrants.columns))

# Combine all ads into a single DataFrame
df_ads = pd.concat((df_warrants[common_cols], df_reputation[common_cols]), axis=0)
Step 7.2.1.: Process the ProductCost and ProductPrice columns to count the occurrences of low quality, high quality, not warranted, and warranted products
import ast

# Initialize counters to hold the counts
low_quality_count = 0
high_quality_count = 0
not_warranted_count = 0
warranted_count = 0

# Iterate through the ProductCost and ProductPrice columns to count the occurrences
for product_cost, product_price in zip(df_ads['ProductCost'], df_ads['ProductPrice']):
    if isinstance(product_cost, str):
        product_cost = ast.literal_eval(product_cost)  # Convert string to list
    if isinstance(product_price, str):
        product_price = ast.literal_eval(product_price)  # Convert string to list

    # Count occurrences of 2 (low quality) and 6 (high quality) in ProductCost
    if isinstance(product_cost, list):
        low_quality_count += product_cost.count(2)
        high_quality_count += product_cost.count(6)

    # Count occurrences of 10 (not warranted) and 12 (warranted) in ProductPrice
    if isinstance(product_price, list):
        not_warranted_count += product_price.count(10)
        warranted_count += product_price.count(12)

# Print the counts
print(low_quality_count, high_quality_count, not_warranted_count, warranted_count)
Step 7.2.2.: Combine df_warrants and df_reputation into df_ads and set RatingIndicator
# Strip spaces from column names and standardize the format
df_reputation.columns = df_reputation.columns.str.strip()
df_warrants.columns = df_warrants.columns.str.strip()

# Identify the common columns between the warrants and reputation DataFrames
common_cols = list(set(df_reputation.columns) & set(df_warrants.columns))

# Combine all ads into a single DataFrame
df_ads = pd.concat((df_warrants[common_cols], df_reputation[common_cols]), axis=0)

# Set RatingIndicator to 1 if the consumer has any good or bad ratings at all
df_ads.loc[(df_ads['GoodRatings'] > 0) | (df_ads['BadRatings'] > 0), 'RatingIndicator'] = 1
To ensure consistency between the df_reputation and df_warrants DataFrames, we rename columns to have the same names.
Step 7.0.1: Split participantID to PlayerID (consumer) and ProducerID (producer)
We split the participantID column into two new columns: PlayerID, which contains the IDs of consumers (players), and ProducerID, which contains the IDs of producers (sellers). This is done by filtering on the role column.
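One way to picture this split is the sketch below. It is an illustration only: the sample rows, the helper name `split_participant_id`, and the role labels "consumer" and "producer" are assumptions, since the actual values stored in the role column are not shown here.

```python
import pandas as pd

# Hypothetical sample data; the real role labels may differ.
df = pd.DataFrame({
    "participantID": ["p1", "p2", "p3", "p4"],
    "role": ["consumer", "producer", "consumer", "producer"],
})


def split_participant_id(df, consumer_label="consumer", producer_label="producer"):
    """Copy participantID into PlayerID or ProducerID depending on role.

    Rows whose role does not match get a missing value in that column.
    """
    out = df.copy()
    out["PlayerID"] = out["participantID"].where(out["role"] == consumer_label)
    out["ProducerID"] = out["participantID"].where(out["role"] == producer_label)
    return out


df_split = split_participant_id(df)
```

Using `Series.where` keeps the original row order and leaves the non-matching rows empty, so each row carries exactly one of the two IDs.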
Step 7.1.: Process Warranted and create WarrantCount in df_warrants
We process the Warranted column in df_warrants to create a new column, WarrantCount, which counts the number of warranties for each round.
Extracting consumerCount from the treatment column gives us the number of consumers in the round. We then generate WarrantCount from the boolean values in Warranted: if Warranted is None, WarrantCount is None; if it is a list, each boolean value yields one count in the resulting list.
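The two helpers could look roughly like the sketch below. Two points are assumptions rather than facts from this notebook: that a treatment value is a dict or a JSON string with a "consumerCount" key, and that each True in Warranted contributes consumerCount while each False contributes 0. The real counting rule may differ.

```python
import json


def extract_consumer_count(treatment):
    """Return consumerCount from a treatment value, or None if it is missing.

    Assumption: treatment is a dict or a JSON string with a "consumerCount" key.
    """
    if isinstance(treatment, str):
        try:
            treatment = json.loads(treatment)
        except json.JSONDecodeError:
            return None
    if isinstance(treatment, dict):
        return treatment.get("consumerCount")
    return None


def generate_warrant_count(warranted, consumer_count):
    """Build the WarrantCount list from the Warranted booleans.

    Assumed rule: True -> consumer_count, False -> 0.
    """
    if warranted is None:
        return None
    if isinstance(warranted, list):
        return [consumer_count if flag else 0 for flag in warranted]
    return None  # Any other type is treated as missing


example_counts = generate_warrant_count([True, False, True], 3)
```

Both functions return None for unexpected input instead of raising, which keeps a row-wise `apply` from failing on a single malformed row.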
We standardize column names and identify common columns between the two DataFrames. Then, we concatenate them into a single DataFrame, df_ads.
Step 7.2.1.: Process the ProductCost and ProductPrice columns to count the occurrences of low quality, high quality, not warranted, and warranted products
We count occurrences of four product types: low-quality products (ProductCost = 2), high-quality products (ProductCost = 6), not-warranted products (ProductPrice = 10), and warranted products (ProductPrice = 12). Each ProductCost and ProductPrice entry holds a list, stored as a string in some rows and parsed first, and we tally how many list items match each value.
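The same tally can be written more compactly with `collections.Counter`. This is an alternative sketch, not the notebook's own code; the helper name `count_values` and the sample cells are made up for illustration.

```python
import ast
from collections import Counter


def count_values(cells):
    """Tally every list item across cells; string cells are parsed first."""
    counts = Counter()
    for cell in cells:
        if isinstance(cell, str):
            cell = ast.literal_eval(cell)  # Convert a string such as "[2, 6]" to a list
        if isinstance(cell, list):
            counts.update(cell)
    return counts


# Hypothetical cells mixing strings, lists, and a missing value
cost_counts = count_values(["[2, 6, 2]", [6], None])
price_counts = count_values(["[10, 12]", [12, 12]])

low_quality_count = cost_counts[2]
high_quality_count = cost_counts[6]
not_warranted_count = price_counts[10]
warranted_count = price_counts[12]
```

A Counter returns 0 for values that never occur, so the four lookups are safe even when a product type is absent. Note that `ast.literal_eval` raises on strings that are not valid Python literals, so malformed cells would need separate handling.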
We create a new column, RatingIndicator, which is set to 1 if a consumer has any good or bad ratings, and 0 otherwise. This lets us check whether any ratings were given.
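A `.loc` assignment that only writes 1 leaves the remaining rows as NaN when RatingIndicator does not already exist. If an explicit 0 is wanted for those rows, one option is to build the column from the boolean mask directly, as in this sketch with hypothetical sample data:

```python
import pandas as pd

# Hypothetical ratings; the real df_ads has more columns.
df_ads = pd.DataFrame({
    "GoodRatings": [0, 2, 0, 1],
    "BadRatings": [0, 0, 3, 1],
})

# True where the consumer has at least one good or bad rating
has_rating = (df_ads["GoodRatings"] > 0) | (df_ads["BadRatings"] > 0)

# Cast the mask to integers: 1 if any rating exists, 0 otherwise
df_ads["RatingIndicator"] = has_rating.astype(int)
```

Here the first row has no ratings and gets 0, while the other three rows get 1.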