Step 7 - Extract reputation and warrants dataframes

Combine reputation and warrants dataframes to create df_ads

Step 7.0.0: Rename columns appropriately

df_warrants.rename(columns={'productQuality': 'ProductQuality', 'score': 'Scores', 'brandName': 'BrandName', 'warrants': 'Warranted', 'changedBrand': 'BrandChanged'}, inplace=True)

df_reputation.rename(columns={'productQuality': 'ProductQuality', 'score': 'Scores', 'brandName': 'BrandName', 'warrants': 'Warranted', 'changedBrand': 'BrandChanged'}, inplace=True)

Step 7.0.1: Split participantID to PlayerID (consumer) and ProducerID (producer)

df_reputation['PlayerID'] = df_reputation['participantID'].where(df_reputation['role'] == 'consumer')
df_reputation['ProducerID'] = df_reputation['participantID'].where(df_reputation['role'] == 'producer')
df_warrants['PlayerID'] = df_warrants['participantID'].where(df_warrants['role'] == 'consumer')
df_warrants['ProducerID'] = df_warrants['participantID'].where(df_warrants['role'] == 'producer')

Step 7.1: Process Warranted and create WarrantCount in df_warrants

1. define extract_consumer_count - extracts the `consumerCount` value from the `treatment` column
2. apply extract_consumer_count to create a new `consumerCount` column
3. define generate_warrant_count - generates a `WarrantCount` list based on the 'Warranted' column
- if Warranted is None, return None
- if Warranted is a list, generate a list of counts based on the boolean values
4. apply generate_warrant_count to create a new `WarrantCount` column:
df_warrants["WarrantCount"] = df_warrants.apply(lambda row: generate_warrant_count(row["Warranted"], row["consumerCount"]), axis=1)

Sampled columns Step 7 Illustration

Step 7.2.0: Combine df_warrants and df_reputation into df_ads

# Strip spaces from column names and standardize the format
df_reputation.columns = df_reputation.columns.str.strip()
df_warrants.columns = df_warrants.columns.str.strip()
# Identify the common columns between warrants and reputation df
common_cols = list(set(df_reputation.columns) & set(df_warrants.columns))
# Combine all ads into a single dataframe
df_ads = pd.concat((df_warrants[common_cols], df_reputation[common_cols]), axis=0)

Step 7.2.1: Process the ProductCost and ProductPrice columns to count the occurrences of low quality, high quality, not warranted, and warranted products

import ast

# Initialize the counters
low_quality_count = 0
high_quality_count = 0
not_warranted_count = 0
warranted_count = 0

# Iterate through the ProductCost and ProductPrice columns to count the occurrences
for product_cost, product_price in zip(df_ads['ProductCost'], df_ads['ProductPrice']):
    # Convert string-encoded lists (e.g. '[2, 6]') to real lists
    if isinstance(product_cost, str):
        product_cost = ast.literal_eval(product_cost)
    if isinstance(product_price, str):
        product_price = ast.literal_eval(product_price)

    # Count occurrences of 2 (low quality) and 6 (high quality) in ProductCost
    if isinstance(product_cost, list):
        low_quality_count += product_cost.count(2)
        high_quality_count += product_cost.count(6)

    # Count occurrences of 10 (not warranted) and 12 (warranted) in ProductPrice
    if isinstance(product_price, list):
        not_warranted_count += product_price.count(10)
        warranted_count += product_price.count(12)

# Print the counts
print(f"Low quality: {low_quality_count}, High quality: {high_quality_count}, "
      f"Not warranted: {not_warranted_count}, Warranted: {warranted_count}")

Step 7.2.3: Add RatingIndicator to df_ads

# Initialize RatingIndicator
df_ads['RatingIndicator'] = 0

# Set RatingIndicator to 1 if the consumer has any good or bad ratings at all
df_ads.loc[(df_ads['GoodRatings'] > 0) | (df_ads['BadRatings'] > 0), 'RatingIndicator'] = 1

Step 7.0.0: Rename columns appropriately

To ensure consistency between the df_reputation and df_warrants DataFrames, we rename columns to have the same names.
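
As an illustration of the renaming step, here is a minimal sketch that applies one shared mapping to a toy DataFrame. The names `rename_map` and `df_demo`, and the sample values, are hypothetical; only the mapping entries come from this step.

```python
import pandas as pd

# Mapping taken from Step 7.0.0; the variable name rename_map is an assumption.
rename_map = {
    "productQuality": "ProductQuality",
    "score": "Scores",
    "brandName": "BrandName",
    "warrants": "Warranted",
    "changedBrand": "BrandChanged",
}

# Toy frame with invented values, for illustration only.
df_demo = pd.DataFrame({"productQuality": ["high"], "score": [4], "round": [1]})

# Keys missing from the frame are ignored by default; unmapped columns are kept as-is.
df_demo = df_demo.rename(columns=rename_map)
print(list(df_demo.columns))
```

Reusing a single mapping for both DataFrames keeps their column names from drifting apart.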

Step 7.0.1: Split participantID to PlayerID (consumer) and ProducerID (producer)

We split the participantID column into two new columns: PlayerID contains the IDs of consumers (players), and ProducerID contains the IDs of producers (sellers). This is done by filtering on the role column, so rows with a non-matching role get a missing value in that column.
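
The effect of the split can be seen on a small example. This is a sketch on invented data (`df_demo` and its IDs are hypothetical); it relies on `Series.where`, which keeps a value where the condition holds and inserts a missing value elsewhere.

```python
import pandas as pd

# Toy data; the IDs and roles are invented for illustration only.
df_demo = pd.DataFrame({
    "participantID": ["p1", "p2", "p3"],
    "role": ["consumer", "producer", "consumer"],
})

# Keep the ID only where the role matches; other rows become missing (NaN).
df_demo["PlayerID"] = df_demo["participantID"].where(df_demo["role"] == "consumer")
df_demo["ProducerID"] = df_demo["participantID"].where(df_demo["role"] == "producer")

print(df_demo)
```

Each row ends up with exactly one of PlayerID or ProducerID filled in.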

Step 7.1: Process Warranted and create WarrantCount in df_warrants

We process the Warranted column in df_warrants to create a new column, WarrantCount, which holds a list of warrant counts for each round.

Extracting consumerCount from the treatment column gives us the number of consumers in the round. We then build WarrantCount from the boolean values in Warranted: as the example below shows, each True entry becomes the round's consumerCount and each False entry becomes 0. If Warranted is None, WarrantCount is None as well.

Input: 
Warranted: [[True, False, True], [False, False, False]] consumerCount: [3, 2]

Output: 
WarrantCount: [[3, 0, 3], [0, 0, 0]]
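
The two helper functions are described but not shown, so the following is only a sketch of what they might look like. It assumes each `treatment` value is a dict with a `consumerCount` key (the real format is not given here), and it infers the True/False mapping from the input and output above.

```python
import pandas as pd

def extract_consumer_count(treatment):
    """Return consumerCount from a treatment record, or None if unavailable.

    Assumption: treatment is a dict with a 'consumerCount' key.
    """
    if isinstance(treatment, dict):
        return treatment.get("consumerCount")
    return None

def generate_warrant_count(warranted, consumer_count):
    """Map each True in `warranted` to consumer_count and each False to 0."""
    if warranted is None:
        return None
    if isinstance(warranted, list):
        return [consumer_count if flag else 0 for flag in warranted]
    return None  # unexpected type: treated like a missing value in this sketch

# Toy frame reproducing the example input above.
df_demo = pd.DataFrame({
    "Warranted": [[True, False, True], [False, False, False]],
    "treatment": [{"consumerCount": 3}, {"consumerCount": 2}],
})
df_demo["consumerCount"] = df_demo["treatment"].apply(extract_consumer_count)
df_demo["WarrantCount"] = df_demo.apply(
    lambda row: generate_warrant_count(row["Warranted"], row["consumerCount"]),
    axis=1,
)
print(df_demo["WarrantCount"].tolist())
```

If `treatment` is stored differently (for example as a JSON string), only `extract_consumer_count` would need to change.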

Step 7.2.0: Create df_ads

We strip surrounding whitespace from the column names of both DataFrames and identify the columns they share. Then we concatenate those shared columns row-wise into a single DataFrame, df_ads.
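
A small sketch of this step on invented data follows. The frames `df_w` and `df_r` are hypothetical stand-ins for df_warrants and df_reputation. Unlike the set intersection used in the step, this variant keeps the column order of the first frame and resets the row index; both are optional choices, not part of the original step.

```python
import pandas as pd

# Toy frames with stray spaces in some column names (invented values).
df_w = pd.DataFrame({" BrandName": ["A"], "Scores": [1], "WarrantCount": [[3]]})
df_r = pd.DataFrame({"BrandName ": ["B"], "Scores": [2], "GoodRatings": [5]})

# Strip surrounding whitespace so equal names compare equal.
df_w.columns = df_w.columns.str.strip()
df_r.columns = df_r.columns.str.strip()

# Shared columns, in df_w's order (a set intersection has no guaranteed order).
common_cols = [c for c in df_w.columns if c in df_r.columns]

# Stack the shared columns row-wise; ignore_index avoids duplicate row labels.
df_ads_demo = pd.concat([df_w[common_cols], df_r[common_cols]], axis=0, ignore_index=True)
print(df_ads_demo)
```

Columns that exist in only one frame (here WarrantCount and GoodRatings) are dropped by this approach.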

Step 7.2.1: Process the ProductCost and ProductPrice columns to count the occurrences of low quality, high quality, not warranted, and warranted products

We count four kinds of products across all rows of df_ads: low-quality (ProductCost = 2), high-quality (ProductCost = 6), not-warranted (ProductPrice = 10), and warranted (ProductPrice = 12). Values stored as strings are first parsed into lists, and each list is then scanned for these values.

Input:
ProductCost: ['[2, 6]', '[6, 2]']
ProductPrice: ['[10, 12]', '[12, 10]']

Output:
Low quality: 2, High quality: 2, Not warranted: 2, Warranted: 2
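
The example above can be reproduced with a short standalone sketch. The helper `to_list` and the plain Python lists standing in for the df_ads columns are assumptions made for illustration.

```python
import ast

def to_list(value):
    """Parse a string such as '[2, 6]' into a list; return other values unchanged."""
    if isinstance(value, str):
        return ast.literal_eval(value)
    return value

# Example input from above, as plain lists instead of DataFrame columns.
product_cost = ['[2, 6]', '[6, 2]']
product_price = ['[10, 12]', '[12, 10]']

low = high = not_warranted = warranted = 0
for cost, price in zip(product_cost, product_price):
    cost, price = to_list(cost), to_list(price)
    if isinstance(cost, list):
        low += cost.count(2)             # low quality
        high += cost.count(6)            # high quality
    if isinstance(price, list):
        not_warranted += price.count(10)
        warranted += price.count(12)

print(f"Low quality: {low}, High quality: {high}, "
      f"Not warranted: {not_warranted}, Warranted: {warranted}")
# Low quality: 2, High quality: 2, Not warranted: 2, Warranted: 2
```

`ast.literal_eval` only parses Python literals, which makes it a safer choice than `eval` for string-encoded lists.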

Step 7.2.3: Add RatingIndicator to df_ads

We create a new column, RatingIndicator, which is 1 if a consumer has any good or bad ratings (GoodRatings > 0 or BadRatings > 0) and 0 otherwise. It flags whether any ratings were given.

Input:
GoodRatings: [1, 0, 2]
BadRatings: [0, 0, 1]

Output:
RatingIndicator: [1, 0, 1]
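
The same result can be obtained in one vectorized step. This sketch uses a toy frame (`df_demo` is hypothetical) and converts the boolean mask directly to 0/1 instead of initializing the column and then overwriting it.

```python
import pandas as pd

# Example input from above.
df_demo = pd.DataFrame({"GoodRatings": [1, 0, 2], "BadRatings": [0, 0, 1]})

# True where at least one rating of either kind exists.
has_rating = (df_demo["GoodRatings"] > 0) | (df_demo["BadRatings"] > 0)

# Convert the boolean mask to 0/1.
df_demo["RatingIndicator"] = has_rating.astype(int)
print(df_demo["RatingIndicator"].tolist())  # [1, 0, 1]
```

Note the parentheses around each comparison: `|` binds more tightly than `>` in Python, so they are required.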
