05 | Writing a utility class for generating fake data

Starting my Juejin growth journey! This is day 9 of my participation in the Juejin Daily Update Plan, December Writing Challenge.

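1. Background

We want to test some hands-on Elasticsearch (es) search scenarios under large data volumes. Inserting that much data by hand is impractical, and using production data for testing and learning is not appropriate either, so we need fake data that stays close to real-world data. I previously came across a very good tool for this, datafaker, which generates realistic-looking data. We will use it to produce the data for the experiment projects later in this series.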

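2. Basic usage of datafaker

Let's start with a quick look at datafaker. Its official documentation is at www.datafaker.net/documentati…

The library ships with more than 100 built-in data providers, which we can use to generate large amounts of test data. The screenshot below shows only some of them; consult the official documentation to find the API that produces the data you need.

(Screenshot: image-20221121140329695.png)

Next, let's try datafaker out. Add the dependency to pom.xml: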
 <dependency>
      <groupId>net.datafaker</groupId>
      <artifactId>datafaker</artifactId>
      <version>1.6.0</version>
 </dependency>

Under com.daiyu.elastic.common.util, add a utility class MockDataUtil.java to hold our fake-data methods. As a first demonstration, we generate people's names, using a loop to produce 10 random fake records.

 public static void main(String[] args) {
        // Create the Faker once and reuse it; building a new one per iteration is unnecessary.
        Faker faker = new Faker();
        for (int i = 0; i < 10; i++) {
            String name = faker.name().fullName();
            String firstName = faker.name().firstName();
            String lastName = faker.name().lastName();

            String streetAddress = faker.address().streetAddress();
            log.info("name:{},firstName:{},lastName:{},streetAddress:{}", name, firstName, lastName, streetAddress);
        }
    }

The output looks like this:

name:Shannon Hettinger,firstName:Faviola,lastName:Satterfield,streetAddress:8468 Harold Center
name:Darius Bechtelar,firstName:Noel,lastName:Reinger,streetAddress:5019 Nikolaus Forges
name:Carley Green DVM,firstName:Nevada,lastName:Frami,streetAddress:3328 MacGyver Vista
name:Mr. Shenna Effertz,firstName:Margart,lastName:Beer,streetAddress:080 Carter Prairie
name:Valentine Prohaska,firstName:Kellye,lastName:Tillman,streetAddress:046 Greenfelder Green
name:Jessie Fisher,firstName:Russ,lastName:Skiles,streetAddress:28672 Beahan Route
name:Merrill Labadie Jr.,firstName:Billy,lastName:Heller,streetAddress:4532 Brakus Park
name:Jerald Marvin,firstName:Dia,lastName:Schiller,streetAddress:93120 Takako Turnpike
name:Oliver Schmitt,firstName:Sheryll,lastName:Bailey,streetAddress:85094 Mayert Ford
name:Rayford Smitham,firstName:Ariel,lastName:Lynch,streetAddress:139 Rutherford Pass

To generate Chinese content instead, simply pass Locale.CHINA when creating the Faker object; this sets the locale to Chinese.

 Faker faker = new Faker(Locale.CHINA);

After this change, run the program again. The output is:

name:苏昊焱,firstName:雪松,lastName:卢,streetAddress:郑巷59308号
name:谭凯瑞,firstName:鸿煊,lastName:黄,streetAddress:顾桥963号
name:曾子默,firstName:昊焱,lastName:邓,streetAddress:万栋910号
name:阎越彬,firstName:瑞霖,lastName:赖,streetAddress:冯路8135号
name:毛笑愚,firstName:博文,lastName:阎,streetAddress:罗桥498号
name:高智宸,firstName:晓啸,lastName:马,streetAddress:崔桥8735号
name:韦楷瑞,firstName:修杰,lastName:丁,streetAddress:朱侬4号
name:朱志泽,firstName:弘文,lastName:范,streetAddress:丁中心900号
name:袁伟宸,firstName:鑫鹏,lastName:冯,streetAddress:朱巷01号
name:董瑞霖,firstName:绍辉,lastName:郝,streetAddress:阎旁6956号

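datafaker can generate Chinese fake data only for some of its providers; the rest are English-only. If we want to use fake data that we define ourselves, that is also possible: the library provides an abstract class (AbstractProvider) that we can extend. Here is a small example demonstrating custom fake data.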

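// Package names below are as laid out in datafaker 1.6.0.
import java.util.Locale;

import lombok.extern.slf4j.Slf4j;
import net.datafaker.AbstractProvider;
import net.datafaker.Faker;

@Slf4j
public class MockDataUtil {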

    private static CompanyFaker companyFaker=new CompanyFaker(Locale.CHINA);

    public static class MyCompany extends AbstractProvider {
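        private static final String[] COMPANY_NAMES = new String[]{"阿里", "腾讯", "百度", "字节", "美团", "米哈游", "微软", "大疆", "华为", "小红书", "SHEIN"};

        // No extra static Faker here: the Faker passed to the constructor is kept by
        // AbstractProvider as the inherited `faker` field, so this provider shares
        // the random source of the Faker that owns it.
        public MyCompany(Faker faker) {
            super(faker);
        }

        public String nextCompanyName() {
            return COMPANY_NAMES[faker.random().nextInt(COMPANY_NAMES.length)];
        }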
    }

    public static class CompanyFaker extends Faker {
        public CompanyFaker(Locale locale) {
            super(locale);
        }
        public MyCompany companyName() {
            return getProvider(MyCompany.class, () -> new MyCompany(this));
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(companyFaker.companyName().nextCompanyName());
        }

    }
}

The output looks like this:

SHEIN
阿里
百度
腾讯
小红书

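That completes the demonstration of basic datafaker usage.

3. Writing the utility class to generate fake data

3.1 The business data we need to fake

Here we will first write a tool that generates fake product data. Let's look at the product entity class and the fields it contains.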

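Field         Description               Type
id            id                        Long
category      category                  String
basePrice     list (tag) base price     BigDecimal
marketPrice   actual selling price      BigDecimal
stockNum      stock quantity            Integer
skuImgUrl     product image link        String
skuId         product SKU id            String
skuName       product name              String
createTime    creation time             Date (yyyy-MM-dd HH:mm:ss)
updateTime    update time               Date (yyyy-MM-dd HH:mm:ss)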

The id can be generated with an atomic increment class (for example, java.util.concurrent.atomic.AtomicLong).
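A minimal sketch of the atomic-counter approach, assuming ids only need to be unique and increasing within a single JVM run. The class name GoodsIdGenerator and the starting value are illustrative and not part of the original project.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: hands out increasing ids and is safe to call from several threads. */
class GoodsIdGenerator {

    // Starts at 0 so the first id handed out is 1 (an assumption; any start value works).
    private static final AtomicLong COUNTER = new AtomicLong(0);

    private GoodsIdGenerator() {
    }

    /** Returns the next id; incrementAndGet is atomic, so concurrent callers never get duplicates. */
    public static long nextId() {
        return COUNTER.incrementAndGet();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(nextId());
        }
    }
}
```

Because the counter lives in memory, ids restart after every JVM restart; for throwaway test data that is usually acceptable.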

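For category, we can generate a number and map it to an enum. Here is a rough list of product category enum values.

Code   Value
1      电器 (electrical appliances)
2      家庭五金 (household hardware)
3      文具 (stationery)
4      图书 (books)
5      化妆品 (cosmetics)
6      数码 (digital devices)
7      服装 (clothing)
8      玩具 (toys)
9      体育健身 (sports and fitness)
10     食品 (food)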

Under com.daiyu.elastic.eunms, create the category enum class CategoryEnums.java:

import lombok.AllArgsConstructor;
import lombok.Getter;

@Getter
@AllArgsConstructor
public enum CategoryEnums {
    ELECTRIC(1, "电器"),
    HARDWARE(2, "家庭五金"),
    STATIONERY(3, "文具"),
    BOOK(4, "图书"),
    COSMETICS(5, "化妆品"),
    DIGITAL(6, "数码"),
    CLOTHING(7, "服装"),
    TOYS(8, "玩具"),
    PHYSICAL(9, "体育健身"),
    FOOD(10, "食品");
    // Enum constants should be immutable, so the fields are final and there are no setters.
    // Lombok's @Getter already generates getCode() and getName().
    private final int code;
    private final String name;
    public static CategoryEnums getByCode(int code) {
        for (CategoryEnums value : values()) {
            if (value.getCode() == code) {
                return value;
            }
        }
        return ELECTRIC;
    }
}

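This enum is used later when querying. In production, to save storage, tag and category fields are usually stored as the enum code, and the code is mapped back to the enum value at query time.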
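The store-the-code, show-the-name idea can be sketched as follows. This is a self-contained illustration rather than the project's code: the nested Category enum is a trimmed, hypothetical copy with three constants, and the "unknown" fallback and the map-based lookup are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: store the compact code, translate it back to a display name when reading. */
class CategoryLookupDemo {

    /** Trimmed copy of the category enum (three constants only) to keep the sketch short. */
    enum Category {
        ELECTRIC(1, "电器"),
        BOOK(4, "图书"),
        FOOD(10, "食品");

        private final int code;
        private final String name;

        Category(int code, String name) {
            this.code = code;
            this.name = name;
        }

        // Built once so each lookup is a map hit instead of a scan over values().
        private static final Map<Integer, Category> BY_CODE = new HashMap<>();

        static {
            for (Category c : values()) {
                BY_CODE.put(c.code, c);
            }
        }

        /** Returns the display name for a stored code, or "unknown" (an assumed fallback). */
        static String nameOf(int code) {
            Category c = BY_CODE.get(code);
            return c == null ? "unknown" : c.name;
        }
    }

    public static void main(String[] args) {
        int storedCode = Category.BOOK.code; // the value that would be written to the index
        System.out.println(storedCode + " -> " + Category.nameOf(storedCode)); // 4 -> 图书
    }
}
```

Returning an explicit "unknown" makes unmapped codes visible, whereas falling back to a real category can hide bad data; which behavior is preferable depends on the application.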

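Now let's look at the remaining fields.

basePrice, marketPrice, and stockNum are numeric fields that need random values; java.util.Random is enough for this. Note, however, that basePrice must be greater than marketPrice, and stockNum must be greater than or equal to 0.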
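One way to satisfy both constraints is to derive marketPrice from basePrice instead of drawing the two independently. The sketch below assumes this approach; the class name GoodsNumbers, the price and stock upper bounds, and the discount range are all illustrative choices, not values from the original project.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper illustrating the constraints basePrice > marketPrice and stockNum >= 0. */
class GoodsNumbers {

    // Upper bounds are assumptions chosen for illustration.
    private static final double MAX_BASE_PRICE = 10_000.0;
    private static final int MAX_STOCK = 10_000;

    private GoodsNumbers() {
    }

    /** Base (list) price of at least 1.00, rounded to two decimal places. */
    public static BigDecimal randomBasePrice() {
        double raw = ThreadLocalRandom.current().nextDouble(1.0, MAX_BASE_PRICE);
        return BigDecimal.valueOf(raw).setScale(2, RoundingMode.HALF_UP);
    }

    /** Market price: the base price times a factor in [0.50, 0.95), so it is always strictly lower. */
    public static BigDecimal randomMarketPrice(BigDecimal basePrice) {
        double factor = ThreadLocalRandom.current().nextDouble(0.50, 0.95);
        return basePrice.multiply(BigDecimal.valueOf(factor)).setScale(2, RoundingMode.DOWN);
    }

    /** Stock quantity in [0, MAX_STOCK); nextInt(bound) never returns a negative number. */
    public static int randomStockNum() {
        return ThreadLocalRandom.current().nextInt(MAX_STOCK);
    }

    public static void main(String[] args) {
        BigDecimal base = randomBasePrice();
        System.out.println("basePrice=" + base + ", marketPrice=" + randomMarketPrice(base)
                + ", stockNum=" + randomStockNum());
    }
}
```

Rounding the market price down rather than half-up ensures the rounding step can never push it back up to the base price.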

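skuImgUrl holds the product image link; datafaker can generate it for us (the code below uses faker.internet().image()).

skuName and skuId are both product attributes and can also be generated randomly. createTime and updateTime can be generated according to a given date format pattern.

Next, here is the code for the utility class.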
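As an illustration of pattern-based timestamps, here is a sketch that uses java.time instead of datafaker's date provider, with the added assumption that updateTime should never be earlier than createTime. The class name GoodsTimestamps, the fixed reference date, and the one-year window are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: random createTime/updateTime strings in yyyy-MM-dd HH:mm:ss form. */
class GoodsTimestamps {

    // MM is month-of-year and mm is minute-of-hour; mixing them up is an easy mistake.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Assumed window: timestamps fall within the 365 days before this fixed reference point.
    private static final LocalDateTime REFERENCE = LocalDateTime.of(2022, 12, 1, 0, 0, 0);
    private static final long WINDOW_SECONDS = 365L * 24 * 60 * 60;

    private GoodsTimestamps() {
    }

    /** Returns {createTime, updateTime}, with updateTime never earlier than createTime. */
    public static String[] randomCreateAndUpdate() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        long createOffset = random.nextLong(WINDOW_SECONDS);
        long updateOffset = random.nextLong(createOffset + 1); // in [0, createOffset]
        LocalDateTime created = REFERENCE.minusSeconds(createOffset);
        LocalDateTime updated = REFERENCE.minusSeconds(updateOffset);
        return new String[] {created.format(FORMAT), updated.format(FORMAT)};
    }

    public static void main(String[] args) {
        String[] pair = randomCreateAndUpdate();
        System.out.println("createTime=" + pair[0] + ", updateTime=" + pair[1]);
    }
}
```

Strings in this format sort the same way lexicographically as chronologically, which makes them easy to sanity-check in tests.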

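// Package names below are as laid out in datafaker 1.6.0.
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;
import java.util.Random;

import lombok.extern.slf4j.Slf4j;
import net.datafaker.AbstractProvider;
import net.datafaker.Faker;

@Slf4j
public class MockDataUtil {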
    /**
     * Faker with the locale set to Chinese
     */
    private static Faker faker = new Faker(Locale.CHINA);

    /**
     * Date format pattern (MM is month; lowercase mm would mean minutes)
     */
    private static String pattern = "yyyy-MM-dd HH:mm:ss";

    private static CategoryFaker categoryFaker=new CategoryFaker(Locale.CHINA);
    private static SkuFaker skuFaker=new SkuFaker(Locale.CHINA);

    public static String getMockValue(GoodsEnums goodsEnums) {

        switch (goodsEnums) {
            case CATEGORY:
                return categoryFaker.category().nextCategory();
            case SKUID:
                return faker.idNumber().peselNumber();
            case SKUNAME:
                return skuFaker.sku().nextSku();
            case SKUIMGURL:
                return faker.internet().image();
            case CREATETIME:
                return faker.date().birthday(pattern);
            case UPDATETIME:
                return faker.date().birthday(pattern);
            default:
                return "11";
        }
    }

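    // One shared generator. Creating a new Random and re-seeding it with the same
    // seed on every call would return the same value each time.
    private static final Random RANDOM = new Random();

    /** Stock quantity in [0, 10000); nextInt(bound) is never negative. The bound is an assumed limit. */
    public static int getStockNum() {
        return RANDOM.nextInt(10000);
    }

    /** Price between 0.00 and 10000.00 with two decimal places. The range is an assumed limit. */
    public static BigDecimal getPrice() {
        return BigDecimal.valueOf(RANDOM.nextDouble() * 10000).setScale(2, RoundingMode.HALF_UP);
    }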

    public static class Category extends AbstractProvider {
        private static final String[] CATEGORYS = new String[]{"1", "2", "3", "4", "5", "6", "7", "8", "9", "10"};

        // Uses the `faker` field inherited from AbstractProvider; no separate static Faker is needed.
        public Category(Faker faker) {
            super(faker);
        }

        public String nextCategory() {
            return CATEGORYS[faker.random().nextInt(CATEGORYS.length)];
        }
    }

    public static class CategoryFaker extends Faker {
        public CategoryFaker(Locale locale) {
            super(locale);
        }
        public Category category() {
            return getProvider(Category.class, () -> new Category(this));
        }
    }

    public static class Sku extends AbstractProvider {
        private static final String[] SKUNAME = new String[]{"书架","毛巾","花瓶","跑步机","瓷器","电脑","吸尘器","艺术碳雕","山水壁画","微波炉","空调","枕头","床单","枕头套","被子","茶杯","饭碗","牙刷","牙膏","浴巾","洗脸巾","沐浴露","洗发露","纸","垃圾桶","勺子","筷子","锅","洗脸盆","洗洁精","香皂","洗衣粉","梳子","洗衣粉","花露水","手电筒","牙签","电器插座","锁","枕头","床单","枕头套","被子","茶杯","饭碗","牙刷","牙膏","浴巾","洗脸巾","沐浴露","洗发露","纸","垃圾桶","勺子","筷子","锅","洗脸盆","洗洁精","香皂","洗衣粉","梳子","洗衣粉","花露水","手电筒","牙签","电器插座","锁","靠垫","抱枕","果盘","相框","花瓶","放遥控板的小篮子","纸巾盒","烟灰缸","床头柜上的小饭盒","勺子","筷子","洗洁精","手表","随身听","笔记本电脑","包","常备药","维生素类","衣服架","夹 长","绳子","纸篓","创可贴","纱布","胶布&胶","双面胶","凉席","竹枕","各种型号电池","闹钟","台灯"};

        // Uses the `faker` field inherited from AbstractProvider; no separate static Faker is needed.
        public Sku(Faker faker) {
            super(faker);
        }

        public String nextSku() {
            return SKUNAME[faker.random().nextInt(SKUNAME.length)];
        }
    }

    public static class SkuFaker extends Faker {
        public SkuFaker(Locale locale) {
            super(locale);
        }
        public Sku sku() {
            return getProvider(Sku.class, () -> new Sku(this));
        }
    }


}

That completes our fake-data utility. If you want to add other kinds of data, feel free to extend it yourself.