Understanding Process and Final Product Quality in a Flotation Plant

Content on this page is being actively developed and updated.

Flotation is a technique used to separate valuable minerals from gangue, the commercially worthless material in an ore. It exploits differences in the surface properties of these materials. The method makes it possible to extract metals from low-grade and complex ores, which makes it important for industries such as electronics and renewable energy.[link]

The data comes from a real industrial plant and is available on Kaggle [link]. Each row represents the plant’s operational state at a specific time. The dataset has 737,453 rows, recorded from March 10, 2017, to September 9, 2017.

The Python code for this project can be found here >>

By examining the first and last few rows of the dataset, I found that values were repeated from row to row in several columns, including the % Iron Feed, % Silica Feed, % Iron Concentrate, and % Silica Concentrate columns. This observation aligns with the data source’s description, which indicates that some columns were sampled every 20 seconds, while others were sampled hourly.

To check how many distinct values each column contains, I ran the nunique function and got the results below.

date                              4097
% Iron Feed                        278
% Silica Feed                      293
Starch Flow                     409317
Amina Flow                      319416
Ore Pulp Flow                   180189
Ore Pulp pH                     131143
Ore Pulp Density                105805
Flotation Column 01 Air Flow     43675
Flotation Column 02 Air Flow     80442
Flotation Column 03 Air Flow     40630
Flotation Column 04 Air Flow    196006
Flotation Column 05 Air Flow    194711
Flotation Column 06 Air Flow     90548
Flotation Column 07 Air Flow     86819
Flotation Column 01 Level       299573
Flotation Column 02 Level       331189
Flotation Column 03 Level       322315
Flotation Column 04 Level       309264
Flotation Column 05 Level       276051
Flotation Column 06 Level       301502
Flotation Column 07 Level       295667
% Iron Concentrate               38696
% Silica Concentrate             55569
dtype: int64
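
For reference, here is a minimal sketch of that check. The frame is a tiny made-up stand-in for the real data (the values are invented for illustration), so only the call pattern matters here.

```python
import pandas as pd

# Synthetic stand-in for the plant data; the numbers are invented.
df = pd.DataFrame({
    "date": ["2017-03-10 01:00:00"] * 3 + ["2017-03-10 02:00:00"] * 3,
    # Hourly lab value, so it repeats within each hour
    "% Iron Feed": [55.2, 55.2, 55.2, 57.1, 57.1, 57.1],
    # Fast sensor reading, so every row differs
    "Starch Flow": [3019.5, 3024.4, 3043.5, 3047.4, 3033.7, 3079.1],
})

# Count the distinct values in each column
unique_counts = df.nunique()
print(unique_counts)
```

A column with far fewer distinct values than rows, like the date column here, is the sign of a slowly sampled variable.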

The date column contains only 4,097 unique timestamps, which means timestamps were recorded at hourly resolution rather than every 20 seconds. I therefore aggregated all columns to hourly intervals for further analysis.
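
One way to do this aggregation is sketched below. It assumes the mean is used as the hourly summary, which is an assumption for illustration (a median would work the same way), and it runs on a small invented frame rather than the real data.

```python
import pandas as pd

# Synthetic rows: three readings stamped with each hourly timestamp.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2017-03-10 01:00:00"] * 3 + ["2017-03-10 02:00:00"] * 3
    ),
    "% Iron Feed": [55.2, 55.2, 55.2, 57.1, 57.1, 57.1],
    "Ore Pulp pH": [10.0, 10.2, 10.4, 9.8, 9.9, 10.0],
})

# Collapse all rows sharing a timestamp into one row per hour.
# Assumption: the mean is the aggregate of choice.
hourly = df.groupby("date").mean()
print(hourly)
```

Columns that were already hourly (such as % Iron Feed) keep their value, while the fast sensor columns are averaged within each hour.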

1. Comparing Flotation Columns

At first, I intended to compare the columns by their median values. However, the boxplots for each column showed that medians alone were not informative enough: air flow medians were nearly identical across all columns, and level medians fell into only two broad groups (columns 1–3 vs. columns 4–7). I therefore shifted focus to the interquartile range (IQR) as a measure of variability, since the spread between Q1 and Q3 differed more across both air flow and level columns.
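
The idea can be sketched as follows. The two columns below are invented so that they share a median but differ in spread; the helper name iqr is hypothetical and not taken from the project code.

```python
import pandas as pd

# Synthetic hourly values: same median, very different spread.
hourly = pd.DataFrame({
    "Flotation Column 01 Air Flow": [275.0, 300.0, 325.0, 325.0, 275.0],
    "Flotation Column 04 Air Flow": [299.9, 300.0, 300.0, 300.0, 300.1],
})

def iqr(frame):
    """Return Q3 - Q1 for every column (hypothetical helper)."""
    q1 = frame.quantile(0.25)
    q3 = frame.quantile(0.75)
    return q3 - q1

medians = hourly.median()
spread = iqr(hourly)
print(medians)
print(spread)
```

Both columns have a median of 300, so a median comparison would call them identical, while the IQR separates them clearly.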

The two charts show that air flow and levels in the columns do not change in the same way.

Air Flow

Three patterns stand out:

  • Columns 1–3 had high, nearly identical variability, with a middle 50% range of about 50 units, indicating similar operating conditions.
  • Columns 4 and 5 showed almost no variability (IQR close to 0), suggesting that airflow was either consistently maintained or that the sensors were not capturing changes. This needs to be confirmed with the plant manager.
  • Columns 6 and 7 had moderate variability, around 30 and 17 units, respectively.

Levels

Generally, the variability in levels decreased as the process moved from one column to the next. Column 3 broke this pattern: instead of showing lower variability than Column 2, it showed the highest variability of all, nearly 200 units. This finding needs to be reported to the plant manager to determine whether Column 3 is functioning properly or is intentionally designed to operate more variably.


2. Feed vs. Final Product Quality

% Iron Feed vs. % Iron Concentrate

There was a weak positive correlation between % iron feed and % iron concentrate output. Higher iron content in the raw ore was associated with slightly more iron in the final product, but the relationship was not strong.
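
A correlation like this can be computed as sketched below. The sketch assumes Pearson's r, which is the default of pandas' Series.corr; the write-up does not say which coefficient was used, and the values are invented.

```python
import pandas as pd

# Synthetic hourly values, invented for illustration.
hourly = pd.DataFrame({
    "% Iron Feed": [48.8, 55.2, 57.1, 60.2, 64.0, 64.0],
    "% Iron Concentrate": [65.1, 64.2, 66.0, 64.8, 65.6, 65.0],
})

# Pearson correlation coefficient (the default method of Series.corr)
r = hourly["% Iron Feed"].corr(hourly["% Iron Concentrate"])
print(f"Pearson r = {r:.2f}")
```

Values of r near 0 indicate a weak linear relationship, and values near 1 or -1 a strong one.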

The scatter plot shows distinct vertical bands along the % iron feed axis, likely due to hourly sampling. Although % iron feed ranged from around 43% to 66%, with most values around 64%, % iron concentrate stayed between around 62% and 68%. This suggests that feed quality alone did not determine final product quality.

% Silica Feed vs. % Silica Concentrate

3. Tracking Performance Patterns Over Time

To capture the ‘normal’ pattern in the flotation plant, I applied a moving average [link]. Smoothing out short-term fluctuations makes longer-term trends and anomalies easier to identify.
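
Below is a minimal sketch of a 7-day moving average on hourly data. It assumes a trailing time-based window (each point averages the preceding 7 days), which may differ from the exact window used for the charts; the series is synthetic.

```python
import pandas as pd

# Two weeks of synthetic hourly readings: 65.0 in week 1, 66.0 in week 2.
n_hours = 24 * 14
index = pd.date_range("2017-03-10", periods=n_hours, freq=pd.Timedelta(hours=1))
iron = pd.Series([65.0] * (24 * 7) + [66.0] * (24 * 7), index=index)

# Trailing 7-day window; requires a sorted DatetimeIndex.
trend = iron.rolling("7D").mean()
print(trend.iloc[[0, 167, -1]])
```

Because the window is defined in time ("7D") rather than in rows, it still covers seven days when some hours are missing from the data.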

Iron Concentrate Trend (7-Day Moving Average)

From the hourly data (represented by the gray line), I observed spikes reaching 68% and dips as low as 63%. However, the weekly moving average shows that the percentage of iron concentrate generally stays between 64% and 66%, indicating a relatively flat overall trend. This suggests that while the process may be inconsistent at the hourly level, it stays within a stable range over time.

The most notable change is a climb in the trend line around June 2017. This is worth investigating: it may be due to a change in feed quality or to different processing conditions during that period.

Silica Concentrate Trend (7-Day Moving Average)

The Silica Concentrate chart shows more noticeable oscillations, with hourly readings ranging from nearly 0% to 5.5% throughout the period. This variability makes silica harder to manage, which matters because silica is an impurity and higher levels are generally undesirable in iron ore output.

The 7-day moving average for silica starts relatively high, around 2.5% to 3.5%, from April through May. It then drops sharply to its lowest point, approximately 1.3% to 1.5%, in June before gradually rising back to 2.5% to 3% by September. This behavior may be linked to variations in feed material, such as seasonal ore quality, or to a process adjustment made in June that was later reversed.

The dip in June stands out as the most favorable period, where both the trend and hourly readings were at their lowest. Since the objective is to minimize silica content, understanding what changed in June could be beneficial for the plant manager.
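
To pin down when the trend bottomed out, one option is to look up the timestamp of the minimum of the moving average. The sketch below uses a synthetic silica series with invented values (3.0, then 1.4, then 2.8) and the same trailing 7-day window assumption as before.

```python
import pandas as pd

# Synthetic hourly silica readings: 30 days high, 30 days low, 30 days high.
hours_per_block = 24 * 30
values = [3.0] * hours_per_block + [1.4] * hours_per_block + [2.8] * hours_per_block
index = pd.date_range(
    "2017-05-01", periods=len(values), freq=pd.Timedelta(hours=1)
)
silica = pd.Series(values, index=index)

# Trailing 7-day moving average, then the timestamp where it is lowest
trend = silica.rolling("7D").mean()
low_point = trend.idxmin()
print(low_point, round(trend.min(), 2))
```

The timestamp returned marks the end of the lowest 7-day window, which gives a starting point for checking plant logs around that date.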

4. Measuring Process Efficiency

5. Performance by Time of Day

Recommendation to be reported to Plant Manager
